Abstract. In 1888, Hilbert proved that every nonnegative quartic form
$f = f(x, y, z)$ with real coefficients is a sum of three squares of quadratic
forms. His proof was ahead of its time and used advanced methods
from topology and algebraic geometry. Up to now, no elementary proof
is known. Here we present a completely new approach. Although our
proof is not easy, it uses only elementary techniques. As a by-product,
it gives information on the number of representations $f = p_1^2 + p_2^2 + p_3^2$
of $f$ up to orthogonal equivalence. We show that this number is 8 for
generically chosen $f$, and that it is 4 when $f$ is chosen generically with
a real zero. Although these facts were known, there was no elementary
approach to them so far.



Introduction 

In 1888, David Hilbert published an influential paper [3] which became 
fundamental for real algebraic geometry, and which remains an inspiring 
source for research even today. It addresses the problem whether a real form
(homogeneous polynomial) $f(x_0, \dots, x_n)$ which takes nonnegative values on
all of $\mathbb{R}^{n+1}$ is necessarily a sum of squares of real forms. Hilbert proves that
the answer is negative in general. As is well-known, his results go much
beyond this fact and contain a surprising positive aspect as well. Namely,
for any pair $(n, d)$ of integers with $n \ge 2$ and even $d \ge 4$, except for
$(n, d) = (2, 4)$, he shows that there exists a nonnegative form of degree $d$ in $n + 1$
variables which is not a sum of squares of polynomials. In the exceptional
case, however, he proves that every nonnegative ternary quartic form is a
sum of three squares of real quadratic forms.

It is the existence of a representation $f = p_1^2 + p_2^2 + p_3^2$ in this exceptional
case that is the subject of the present article. Hilbert's original proof is brief 
and elegant, and it is ahead of its time in its topological arguments. For his 
contemporaries it must have been hard to grasp. Even today it is not easy 
to read, and it leaves a number of details to be filled in. Several authors have 
given fully detailed accounts of Hilbert's proof in recent years. We mention 
the approach due to Cassels, published in Rajwade's book ([8] chapter 7), 
and the two articles by Rudin [9] and Swan [11]. These approaches also
show some characteristic differences.

One of the first approaches to Hilbert's theorem along elementary and
explicit lines was carried out by Powers and Reznick in [6], where complete 
answers were given in certain special cases. We would also like to point out 
the recent preprint [5] by Plaumann, Sturmfels and Vinzant which studies 
the computational side of Hilbert's theorem, and which contains a beautiful 
blend of the 19th century mathematics of ternary quartics. 

So far, there seems to exist essentially only one proof different from 
Hilbert's. It comes out as a by-product of the quantitative analysis made in 
[7] and [10]. These papers had a different goal, namely to count the number 
of essentially distinct ways in which a positive semidefinite (or psd, for short)
ternary quartic $f$ can be written as a sum of three squares. The case where
the plane projective curve $f = 0$ is non-singular is done in [7], the general
irreducible case is in [10]. Both papers, and in particular the second, use
tools of modern algebraic geometry and can certainly not be called
elementary.

We are convinced that Hilbert's original proof from [3] cannot claim an
elementary character either. This can be seen from the following sketchy 
overview of its main steps: 



(1) The set of sums of three squares of quadratic forms is closed inside
the space of all quartic forms. Therefore it suffices to prove the
existence of a representation for all forms in some open dense subset
of the psd forms, for example for all nonsingular such forms.

(2) Hilbert proves that the map $(p_1, p_2, p_3) \mapsto \sum_{j=1}^3 p_j^2$ (from triples
of real quadratic forms to quartic forms) is submersive (that is, its
tangent maps are surjective), when restricted to the open set of
triples for which the curve $\sum_j p_j^2 = 0$ is nonsingular. His elegant
argument needs some non-trivial tool from algebraic geometry, like
Max Noether's AF + BG theorem.

(3) When the real form $f$ is strictly positive definite and singular, the
curve $f = 0$ has at least two different (complex conjugate) singular
points.

(4) The locus of quartic forms $f$ for which the curve $f = 0$ has at least
two different singularities has codimension $\ge 2$ inside the space of
all quartic forms.

(5) Removing a subspace of codimension $\ge 2$ from a connected topological
space leaves the remaining space connected. Hence, by (3) and
(4), the space of nonsingular positive forms is (path) connected.

(6) There exist nonsingular positive forms which are sums of three squares,
like $f^{(0)} = x^4 + y^4 + z^4$.

(7) Given an arbitrary nonsingular positive form $f$ there exists, by (5),
a path $f^{(t)}$, $0 \le t \le 1$, joining $f^{(1)} = f$ to a sum of three squares
$f^{(0)}$ such that $f^{(t)}$ is nonsingular and positive for every $0 \le t \le 1$.






(8) Using (1) and (2), and by the implicit function theorem, the
representation (6) of $f^{(0)}$ can be extended continuously along the path
$f^{(t)}$ to a representation of $f^{(1)} = f$ as a sum of three squares.

In view of (2), and certainly of (4) and (5), this proof does not have an
elementary character. Also note that the existence of a path as in (7)
is ensured only by the general topological fact (5). There is no concrete
construction of such a path.

Our proof uses a variant of (1), plus applications of the implicit function 
theorem similar to (8). Otherwise it proceeds differently. In particular, we 
avoid the non-elementary steps (2), (4) and (5). Like Hilbert we are deforming
representations along paths. Unlike in Hilbert's proof, however, our
paths are completely explicit, and are in fact simply straight line segments.
Here is a road map: 

(a) By a limit argument (see Lemma 3.6), it suffices to prove the existence of a
representation for generic psd $f$, i.e., for psd $f$ satisfying a condition
$\Psi(f) \ne 0$ where $\Psi$ is a suitable nonzero polynomial in the coefficients
of $f$.

(b) When the form / has a non-trivial real zero, an elementary and 
constructive proof for the existence of a representation as a sum of 
three squares was given by the first author in [4]. We shall recall it
in Sect. 2 below.

(c) Assume that $f$ has no non-trivial real zero. We find a psd form
$f^{(0)}$ that has a non-trivial real zero such that the half-open interval
$]f^{(0)}, f]$ (in the space of all quartic forms) consists of strictly positive
forms.

(d) Let $f^{(t)}$ ($0 \le t \le 1$) denote the forms in the line segment constructed
in (c), with $f^{(1)} = f$. Under generic assumptions on $f$ we show
that every representation of $f^{(0)}$ can be extended continuously to a
representation of $f^{(t)}$ for $0 \le t < \varepsilon$, with some $\varepsilon > 0$.

(e) Under further generic assumptions on $f$ we prove for every fixed
$0 < t < 1$ that every representation of $f^{(t)}$ can be extended continuously
and uniquely to a representation of $f^{(s)}$ for all $s$ sufficiently close to
$t$. Both in (d) and (e) we use the theorem on implicit functions.

(f) Using the limit principle (a), it follows that $f = f^{(1)}$ has a
representation as a sum of three squares.

All our "generic assumptions" on $f$ are explicit. See 9.1 for the entire list and
for a discussion of where they have been used. The exceptional cases that
we have to exclude are given by the vanishing of invariants that are mostly
discriminants or resultants of polynomials formed from (the coefficients of)
$f$. Two of our invariants are of a more general nature, one of them having
the amazing degree of 896 in the coefficients of $f$.

We believe that we have thus achieved a proof of Hilbert's theorem that
only uses elementary tools. With little extra effort, our arguments
in fact yield substantial information on the number of essentially
distinct representations, at least in generic cases. So far there has been
no elementary approach to counting representations. Therefore we think it
worthwhile to include these parts.

Here is an overview of the structure of the paper. We start with the case
where $f$ has a real zero. By an explicit argument we show that $f$ has a
representation as a sum of three squares (Prop. 2.4). Refining the arguments
yields the precise number of inequivalent representations, under suitable
hypotheses of generic nature (Prop. 2.9). In Section 3 we turn to arbitrary psd
quartic forms $f$. We show that $f$ can be written as a sum of three squares
if and only if there exists a polynomial-valued rational point (with certain
side conditions) on a certain elliptic curve associated with $f$ (Prop. 3.3).
No background or terminology on elliptic curves is used. Again we refine
this by a result that permits counting representations (Prop. 3.8). Then we
construct the linear path $f^{(t)}$ ($0 \le t \le 1$) referred to in (d) above and study
the extension of representations along this path. Extension around $t = 0$
is studied in Section 4, around $0 < t < 1$ in Sections 6 and 8. In between
we insert two sections that provide the required background on symmetric
functions. Section 5 has classical material on the discriminant. To handle
the last case of the extension argument, we need an invariant $\Phi(f, g, h)$ of
triples of polynomials which is less standard; it is introduced and discussed
in Section 7. This invariant essentially decides if the pencil spanned by $g$
and $h$ contains a member that has a quadratic factor in common with $f$.
We do not know whether this invariant has been considered before.
Finally, in Section 9 we summarize our proof and give a systematic account
of all the genericity conditions used. We also obtain the precise number of
representations of $f$ under (explicit) generic assumptions on $f$.

Basically, we consider techniques as "elementary" if they are accessible 
using undergraduate mathematics. The most advanced features that we 
use are the theorem on implicit functions and the theorem on symmetric 
functions. Only once (in the proof of Prop. 1.1(b)) are we using slightly
more advanced algebraic techniques, namely basic facts about Dedekind 
domains. However, this part is only used for counting representations, and 
is not needed for the proof of Hilbert's theorem. 

We believe that our approach to representations as sums of three squares 
is also "constructive", at least in a weak sense. It should be possible to 
follow our deformation argument for constructing such representations with 
arbitrary numeric precision, for example by using finite element methods. 

1. The forms <1, q> 

As usual, a polynomial $f(x_1, \dots, x_n)$ with real coefficients is said to be
positive semidefinite (or psd for short) if $f$ takes nonnegative values on $\mathbb{R}^n$.
It is said to be positive definite if $f(x) > 0$ for all $x \in \mathbb{R}^n$. When speaking
of homogeneous polynomials (also called forms), one requires $f(x) > 0$ only
for $x \ne (0, \dots, 0)$, in order to call $f$ positive definite.

We shall mostly be working with homogeneous polynomials, except when
it becomes more convenient to dehomogenize. We start with univariate
(inhomogeneous) real polynomials.

Proposition 1.1. Let $q \in \mathbb{R}[x]$ be a positive definite polynomial of degree
two.

(a) Given any psd polynomial $f \in \mathbb{R}[x]$, there are polynomials $\xi, \eta \in \mathbb{R}[x]$
with

$f = \eta^2 + q\xi^2$. (1.1)

(b) Assume that $f \ne 0$ in (a) satisfies $\deg(f) = 2d$. Then the total
number of solutions $(\xi, \eta)$ to (1.1) is $\le 2^{d+1}$, with equality if and
only if $q \nmid f$ and $f$ is square-free.

For the proof of Hilbert's theorem we only need part (a). The second 
statement will be used in our count of representations. 

Proof. Clearly, $q$ and $f$ may be scaled by any positive real number. By
changing the generator $x$ of the polynomial ring if necessary, we may
therefore assume $q = x^2 + 1$.

First assume that $f$ is monic of degree 2, say $f = (x + a)^2 + b^2$ with real
numbers $a$ and $b$. Then $\xi$ as in (1.1) has to be a constant, and we write
$\xi^2 = \lambda$. Given $\lambda \in \mathbb{R}$, the polynomial

$f - \lambda q = (1 - \lambda) x^2 + 2ax + (a^2 + b^2 - \lambda)$

is a square if and only if either $\lambda = 1$, $a = 0$ and $b^2 \ge 1$, or else $\lambda < 1$ and

$(1 - \lambda)(a^2 + b^2 - \lambda) - a^2 = 0$ (1.2)

(vanishing of the discriminant of $f - \lambda q$). In any case, there is precisely one
value of $\lambda \ge 0$ for which $f - \lambda q$ is a square: For $a = 0$, this is $\lambda = \min\{1, b^2\}$,
while for $a \ne 0$ it is the unique $0 \le \lambda < 1$ for which (1.2) vanishes. (Note
that the left hand side of (1.2) is positive for $\lambda \gg 0$, is $b^2 \ge 0$ for $\lambda = 0$, and
is $-a^2 < 0$ for $\lambda = 1$.) Hence $\xi^2 = \lambda$ and $\eta^2 = f - q\xi^2$ as in (1.1) exist and
are unique. Note that there are exactly four possibilities for the pair $(\xi, \eta)$,
except when $f$ or $fq$ is a square. (In these cases there exist precisely two
possibilities, provided $f \ne 0$).
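
The quadratic case can be checked numerically. The following Python sketch is only an illustration: the function names `smallest_lambda` and `residual_quadratic` are ours, and it assumes $q = x^2 + 1$, $f = (x + a)^2 + b^2$ and $a \ne 0$. Expanding (1.2) gives $\lambda^2 - (1 + a^2 + b^2)\lambda + b^2 = 0$; the code takes the smaller root and confirms that the discriminant of $f - \lambda q$ vanishes.

```python
import math

def smallest_lambda(a: float, b: float) -> float:
    """Root of (1.2) in [0, 1) for f = (x + a)^2 + b^2, q = x^2 + 1, a != 0.

    Expanding (1 - lam) * (a^2 + b^2 - lam) - a^2 = 0 gives
    lam^2 - (1 + a^2 + b^2) * lam + b^2 = 0; we return the smaller root.
    """
    s = 1.0 + a * a + b * b
    return (s - math.sqrt(s * s - 4.0 * b * b)) / 2.0

def residual_quadratic(a: float, b: float, lam: float):
    """Coefficients (c2, c1, c0) of f - lam*q = c2*x^2 + c1*x + c0."""
    return (1.0 - lam, 2.0 * a, a * a + b * b - lam)

a, b = 1.5, 1.0                      # sample values, chosen for illustration
lam = smallest_lambda(a, b)
c2, c1, c0 = residual_quadratic(a, b, lam)
disc = c1 * c1 - 4.0 * c2 * c0       # zero iff f - lam*q is a square (c2 > 0)
print(lam, (c2, c1, c0), disc)
```

For $a = 3/2$, $b = 1$ one finds $\lambda = 1/4$ and $f - \lambda q = \frac{3}{4}(x + 2)^2$, so $\xi = \pm\frac{1}{2}$ and $\eta = \pm\frac{\sqrt{3}}{2}(x + 2)$, which are the four pairs predicted above.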

When $f$ is an arbitrary psd polynomial, we can write $f$ as a product
of quadratic psd polynomials. Using the quadratic case just established,
together with the multiplication formulae

$(a^2 + b^2 q)(c^2 + d^2 q) = (ac \pm bdq)^2 + (ad \mp bc)^2 q$, (1.3)

we conclude that $f$ has a representation (1.1). This proves (a).
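
The multiplication formulae (1.3) are polynomial identities in $a, b, c, d, q$, so they can be spot-checked by substituting integers. The sketch below does this for both sign choices; the helper names `lhs`, `rhs` and `identity_holds` are ours, and a random check of this kind is of course an illustration, not a proof.

```python
import random

def lhs(a, b, c, d, q):
    """Left hand side of (1.3)."""
    return (a * a + b * b * q) * (c * c + d * d * q)

def rhs(a, b, c, d, q, sign):
    """Right hand side of (1.3).

    sign = +1 uses (ac + bdq, ad - bc); sign = -1 uses (ac - bdq, ad + bc).
    """
    return (a * c + sign * b * d * q) ** 2 + (a * d - sign * b * c) ** 2 * q

def identity_holds(trials=200, seed=0):
    """Compare both sides of (1.3) on random integer inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c, d, q = (rng.randint(-50, 50) for _ in range(5))
        if lhs(a, b, c, d, q) != rhs(a, b, c, d, q, +1):
            return False
        if lhs(a, b, c, d, q) != rhs(a, b, c, d, q, -1):
            return False
    return True

print(identity_holds())
```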

For the proof of (b) we use some basic facts about prime ideal factorization
in Dedekind domains. Let $L = \mathbb{R}(x, \sqrt{-q})$, a quadratic extension of the field
$\mathbb{R}(x)$. The integral closure $B$ of $\mathbb{R}[x]$ in $L$ is a Dedekind domain. It consists
of all elements in $L$ whose norm and trace are in $\mathbb{R}[x]$, from which we see
$B = \mathbb{R}[x, \sqrt{-q}]$. The behaviour of the primes in the extension $\mathbb{R}[x] \subset B$
is easy to see: The linear polynomials $\ell$ in $\mathbb{R}[x]$ are unramified in $B$ and
remain prime in $B$, having a quadratic extension of the residue field. The
monic irreducible quadratic polynomials $p \ne q$ in $\mathbb{R}[x]$ are positive definite,
hence they split into a product $p = p_1 p_2$ of two primes in $B$ not associated
to each other, by (a), while the prime $q$ of $\mathbb{R}[x]$ is ramified. Hence $B$ is a
principal ideal domain. Since $\eta^2 + q\xi^2 = (\eta + \xi\sqrt{-q})(\eta - \xi\sqrt{-q})$ is the norm
of $\eta + \xi\sqrt{-q}$ in the extension $\mathbb{R}[x] \subset B$ (for $\xi, \eta \in \mathbb{R}[x]$), the number of
representations (1.1) of $f$ is equal to the number of elements in $B$ of norm
$f$.

The norms of the prime elements of $B$ are $N(\ell) = \ell^2$, $N(p_1) = N(p_2) = p$
and $N(\sqrt{-q}) = q$. This shows that the number of elements in $B$ of norm
$f$ is obtained as follows: Every factor $p^m$ (for $p \ne q$ quadratic irreducible)
contributes $m + 1$ solutions; multiply all these numbers, and multiply the
result by 2. In other words, the precise number is (for $f \ne 0$)

$2 \prod_p (1 + v_p(f))$,

product over the monic irreducible polynomials $p \ne q$ of degree 2. From this
the assertion in (b) is clear. □
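
The counting formula from the proof is easy to evaluate once the factorization of $f$ is known. The following sketch assumes the multiplicities $v_p(f)$ of the monic irreducible quadratic factors $p \ne q$ are supplied by hand; the function name `count_representations` is ours and the code does no factoring itself.

```python
from math import prod

def count_representations(multiplicities):
    """Number of pairs (xi, eta) with f = eta^2 + q*xi^2, for f != 0.

    `multiplicities` lists v_p(f) for the monic irreducible quadratic
    factors p != q of f; linear factors and powers of q do not enter.
    """
    return 2 * prod(1 + m for m in multiplicities)

# Degree 6 (d = 3): three distinct quadratic factors, none equal to q.
print(count_representations([1, 1, 1]))
# Degree 6 with one repeated quadratic factor: f is not square-free.
print(count_representations([2, 1]))
```

In the first case the count reaches the bound $2^{d+1} = 16$ of part (b); in the second case $f$ is not square-free and the count drops to 12.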

It would be possible to present the arguments for part (b) in a way that 
avoids using any theory of Dedekind rings. However we felt that trying this 
is not worth the effort. 

Later it will be preferable for us to use Prop. 1.1 in a homogenized version.
For convenience we state this version here:

Corollary 1.2. Let $q \in \mathbb{R}[x, y]$ be a positive definite quadratic form. Given
any psd form $f \in \mathbb{R}[x, y]$ of degree $2d$, there exist forms $\xi, \eta \in \mathbb{R}[x, y]$ with
$\deg(\xi) = d - 1$, $\deg(\eta) = d$ and $f = \eta^2 + q\xi^2$. The number of such pairs
$(\xi, \eta)$ is $\le 2^{d+1}$, with equality if and only if $q \nmid f$ and $f$ is square-free. □

2. The case where $f$ has a real zero

2.1. Let $f = f(x, y, z)$ be a psd quartic form in $\mathbb{R}[x, y, z]$, and assume that
$f$ has a nontrivial real zero. Changing coordinates linearly we can
assume $f(0, 0, 1) = 0$, hence

$f = f_2(x, y) \cdot z^2 + f_3(x, y) \cdot z + f_4(x, y)$ (2.1)

where $f_j = f_j(x, y)$ is a binary form of degree $j$ ($j = 2, 3, 4$). That $f$ is psd
means that each of the three binary forms

$f_2$, $f_4$, $4f_2 f_4 - f_3^2$

is psd, that is, a sum of two squares. By an argument which is entirely
elementary and explicit, we shall construct a representation of $f$ as a sum of
three squares (Proposition 2.4). For generically chosen $f_2, f_3, f_4$, we shall in
fact construct all such representations (Proposition 2.9). This second part
is not needed for the proof of Hilbert's theorem.

2.2. Let us start by showing that $f$ is a sum of three squares. If $f_2 = 0$
then also $f_3 = 0$, and hence $f = f_4$ is a psd binary form, therefore a
sum of two squares. If $0 \ne f_2 = l^2$ is a square of a linear form, then
$4l^2 f_4 \ge f_3^2$ shows $l \mid f_3$, say $f_3 = 2l g_2$. Observe that $f_4 - g_2^2$ is a sum of two
squares since $4l^2 (f_4 - g_2^2) = 4f_2 f_4 - f_3^2$ is a sum of two squares. Therefore
$f = (lz + g_2)^2 + (f_4 - g_2^2)$ is a sum of three squares.

2.3. It remains to discuss the case where $f_2$ is strictly positive definite. From
Cor. 1.2 we see that there exist binary forms $\xi = \xi(x, y)$ and $\eta = \eta(x, y)$
with $\deg(\xi) = 2$, $\deg(\eta) = 3$ and $\eta^2 + \xi^2 f_2 = 4f_2 f_4 - f_3^2$, that is,

$\eta^2 + f_3^2 = f_2 (4f_4 - \xi^2)$. (2.2)

On the other hand, since $f_2$ is psd, there are linear forms $l_1, l_2 \in \mathbb{R}[x, y]$ with
$f_2 = l_1^2 + l_2^2 = (l_1 + i l_2)(l_1 - i l_2)$ ($i^2 = -1$). By similarly factoring the left
hand side of (2.2), it follows that $l_1 + i l_2$ divides one of $\eta \pm i f_3$. Replacing
$l_2$ by $-l_2$ if necessary we can assume

$(l_1 + i l_2) \mid (\eta + i f_3)$.

This implies that $f_2$ divides $(\eta + i f_3)(l_1 - i l_2) = (\eta l_1 + f_3 l_2) + i (f_3 l_1 - \eta l_2)$.
Hence $f_2$ divides both real and imaginary part of the right hand form. So
the fractions

$h_1 := \frac{f_3 l_1 - \eta l_2}{2 f_2}$, $h_2 := \frac{\eta l_1 + f_3 l_2}{2 f_2}$

are binary quadratic forms (with real coefficients), and (2.2) implies

$h_1^2 + h_2^2 = \frac{(\eta^2 + f_3^2)(l_1^2 + l_2^2)}{4 f_2^2} = \frac{\eta^2 + f_3^2}{4 f_2} = f_4 - \frac{1}{4}\xi^2$.

Moreover

$h_1 l_1 + h_2 l_2 = \frac{f_3 (l_1^2 + l_2^2)}{2 f_2} = \frac{1}{2} f_3$,

and so

$f = \left(\frac{\xi}{2}\right)^2 + (l_1 z + h_1)^2 + (l_2 z + h_2)^2$

is a sum of three squares of quadratic forms. We have thus proved:

Proposition 2.4. Let $f \in \mathbb{R}[x, y, z]$ be a psd quartic form which has a
nontrivial real zero. Then $f$ is a sum of three squares of quadratic forms in
$\mathbb{R}[x, y, z]$. □

Note that the proof was entirely explicit and constructive.

We now turn to the task of determining all representations of $f$, at least
in the case when $f_2, f_3, f_4$ are chosen generically. For this, the following
definition is useful.

Definition 2.5. Two representations

$f = \sum_{i=1}^3 p_i^2 = \sum_{i=1}^3 p_i'^2$

with quadratic forms $p_i, p_i' \in \mathbb{R}[x, y, z]$ are said to be (orthogonally)
equivalent if there exists an orthogonal matrix $S = (s_{ij}) \in O_3(\mathbb{R})$ such that

$p_j' = \sum_{i=1}^3 s_{ij} p_i$ ($j = 1, 2, 3$).
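
To see why an orthogonal change of the triple does not alter the sum of squares, one can evaluate both sides at a few points. The sketch below is only an illustration: the three quadratic forms in `p` and the rational orthogonal matrix `S` (built from the 3-4-5 triple) are hypothetical sample data, not taken from the text.

```python
from fractions import Fraction as F

def p(x, y, z):
    """Values of three sample quadratic forms p_1, p_2, p_3 at (x, y, z)."""
    return (x * x + z * z, x * y - y * z, y * y + 2 * x * z)

# A rational orthogonal matrix S: rows are orthonormal.
S = [[F(3, 5), F(4, 5), F(0)],
     [F(-4, 5), F(3, 5), F(0)],
     [F(0), F(0), F(1)]]

def transformed(x, y, z):
    """Values of p'_j = sum_i s_ij * p_i, as in Definition 2.5."""
    vals = p(x, y, z)
    return tuple(sum(S[i][j] * vals[i] for i in range(3)) for j in range(3))

def sum_sq(vals):
    return sum(v * v for v in vals)

points = [(1, 2, 3), (-2, 0, 5), (4, -1, 1)]
print(all(sum_sq(p(*pt)) == sum_sq(transformed(*pt)) for pt in points))
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating point tolerance check.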

2.6. Let $f = f_2 z^2 + f_3 z + f_4$ be a psd form as in (2.1). We assume that $f_2$
is not a square, hence is strictly positive definite. Assume

$f = \sum_{i=1}^3 (v_i z + w_i)^2$ (2.3)

where $v_i$ resp. $w_i \in \mathbb{R}[x, y]$ are homogeneous of respective degrees 1 resp. 2
($i = 1, 2, 3$). We first show how to associate with (2.3) a solution $(\xi, \eta)$ of
(2.2).

Consider the column vectors $v = (v_1, v_2, v_3)^t$ and $w = (w_1, w_2, w_3)^t$ with
polynomial entries. Since the linear forms $v_1, v_2, v_3$ are linearly dependent,
there is an orthogonal matrix $S \in O_3(\mathbb{R})$ such that the first entry of the
column $Sv$ is zero. Replacing $v$ resp. $w$ by $Sv$ resp. $Sw$ yields an equivalent
representation $f = \sum_{i=1}^3 (v_i' z + w_i')^2$ in which $v_1' = 0$. So up to replacing (2.3)
by an equivalent representation we can assume $v_1 = 0$, and get accordingly

$f_2 = v_2^2 + v_3^2$, $f_3 = 2(v_2 w_2 + v_3 w_3)$, $f_4 = w_1^2 + w_2^2 + w_3^2$.

Putting $\xi := 2w_1$ and $\eta := 2(v_2 w_3 - v_3 w_2)$ gives

$\eta^2 + f_3^2 = 4(v_2 w_3 - v_3 w_2)^2 + 4(v_2 w_2 + v_3 w_3)^2$
$= 4(v_2^2 + v_3^2)(w_2^2 + w_3^2)$
$= f_2 (4f_4 - \xi^2)$,

so $(\xi, \eta)$ solves (2.2).
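
The computation above is a polynomial identity in $v_2, v_3, w_1, w_2, w_3$, so it can be spot-checked by substituting integers for the values of these forms at a fixed point $(x, y)$. The following sketch does this; the function names `invariants` and `solves_2_2` are ours, and the random check is an illustration rather than a proof.

```python
import random

def invariants(v2, v3, w1, w2, w3):
    """Return (f2, f3, f4, xi, eta) for a representation (2.3) with v1 = 0."""
    f2 = v2 * v2 + v3 * v3
    f3 = 2 * (v2 * w2 + v3 * w3)
    f4 = w1 * w1 + w2 * w2 + w3 * w3
    xi = 2 * w1
    eta = 2 * (v2 * w3 - v3 * w2)
    return f2, f3, f4, xi, eta

def solves_2_2(v2, v3, w1, w2, w3):
    """Check eta^2 + f3^2 == f2 * (4*f4 - xi^2), i.e. equation (2.2)."""
    f2, f3, f4, xi, eta = invariants(v2, v3, w1, w2, w3)
    return eta * eta + f3 * f3 == f2 * (4 * f4 - xi * xi)

rng = random.Random(1)
samples = [[rng.randint(-20, 20) for _ in range(5)] for _ in range(100)]
print(all(solves_2_2(*s) for s in samples))
```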

Note that a different choice of $S$ does not change $\xi^2$ and $\eta^2$. Indeed,
the first row of $S$ is unique up to a factor $\pm 1$ since $v_1, v_2, v_3$ span the
space of linear forms in $\mathbb{R}[x, y]$. Therefore $\pm\xi$ does not change if $S$ is chosen
differently. The same argument shows that $\xi^2$ and $\eta^2$ depend only on the
equivalence class of (2.3).

2.7. When $f_2$ is not a square, note that the number of solutions $(\xi, \eta)$ of
(2.2) was determined in Prop. 1.1(b). In particular, it was shown there that
this number is $\le 16$, and is equal to 16 if and only if $f_2 \nmid f_3$ and $4f_2 f_4 - f_3^2$ is
square-free. In this latter case, the pair $(\xi^2, \eta^2)$ can therefore take precisely
four different values.

2.8. Assume that $f_2$ is not a square, that $f_2 \nmid f_3$ and $4f_2 f_4 - f_3^2$ is
square-free. We show that inequivalent representations (2.3) give different solutions
$(\xi^2, \eta^2)$ to (2.2). Combined with 2.7 this will imply that $f$ has precisely four
different representations up to equivalence.

Let

$f = w_1^2 + (v_2 z + w_2)^2 + (v_3 z + w_3)^2$
$= w_1'^2 + (v_2' z + w_2')^2 + (v_3' z + w_3')^2$

be two representations with the same invariant $\xi^2$, that is, with
$w_1^2 = w_1'^2 = \frac{1}{4}\xi^2$. Then $v_2 w_3 - v_3 w_2 = \pm(v_2' w_3' - v_3' w_2')$, and we can assume

$v_2 w_3 - v_3 w_2 = v_2' w_3' - v_3' w_2'$

by multiplying $v_2 z + w_2$ with $-1$ if necessary. Writing $v = v_2 + i v_3$,
$w = w_2 + i w_3$ and $v' = v_2' + i v_3'$, $w' = w_2' + i w_3'$ this means
$\Im(\bar{v} w) = \Im(\bar{v}' w')$. On the other hand we have

$v\bar{v} = v'\bar{v}' = f_2$, $\Re(\bar{v} w) = \Re(\bar{v}' w') = \frac{1}{2} f_3$, $4w\bar{w} = 4w'\bar{w}' = 4f_4 - \xi^2$,

and we conclude

$\bar{v} w = \bar{v}' w'$. (2.4)

Now $v$ does not divide $w'$, because otherwise $v\bar{v} = f_2$ would divide
$4w'\bar{w}' = 4f_4 - \xi^2$, and hence we would have

$f_2^2 \mid (\eta^2 + f_3^2) = (\eta + i f_3)(\eta - i f_3)$,

whence $f_2 \mid f_3$, which was excluded. Comparing the two products (2.4) we
see that there exist $\lambda, \mu \in \mathbb{C}$ with $v' = \lambda v$ and $w' = \mu w$, and clearly we
must have $|\lambda| = |\mu| = 1$. Therefore (2.4) shows $\lambda = \mu$. This means that the
two representations we started with are equivalent.

We summarize these discussions: 

Proposition 2.9. Let $f = f_2 z^2 + f_3 z + f_4$ be psd (with $f_i \in \mathbb{R}[x, y]$
homogeneous of degree $i$, for $i = 2, 3, 4$), and assume that $f_2$ is not a square.

(a) Associated with each representation of $f$ as a sum of three squares
is a well-defined solution of

$\eta^2 + f_3^2 = f_2 (4f_4 - \xi^2)$

such that $\xi^2$ and $\eta^2$ depend only on the orthogonal equivalence class
of the representation.

(b) If $f_2 \nmid f_3$ and $4f_2 f_4 - f_3^2$ is square-free, then any two representations
of $f$ with the same invariants $\xi^2$, $\eta^2$ are equivalent. There exist
precisely four different equivalence classes of representations of $f$.

□ 

Remark 2.10. Let $f = f_2 z^2 + f_3 z + f_4$ be psd, as in Proposition 2.9. The real
zero $(0, 0, 1)$ is a singularity of the projective curve $f = 0$. That $f_2$ is not
a square means that this singularity is a node (with two complex conjugate
tangents). When $f_2 \nmid f_3$ and $4f_2 f_4 - f_3^2$ is square-free, one can show that
$(0, 0, 1)$ is the only singularity of the curve (the converse is not true). The
fact that $f$ has precisely four inequivalent representations is in agreement
with the results of [10].

3. The case where $f$ has no real zero

The following normalization lemma was proved in [4]:

Lemma 3.1. Let $f = f(x, y, z)$ be a strictly positive definite form of degree
four in $\mathbb{R}[x, y, z]$. Then, by a linear change of coordinates, $f$ can be brought
into the form

$f = z^4 + f_2 z^2 + f_3 z + f_4$ (3.1)

in which $f_j \in \mathbb{R}[x, y]$ is a form of degree $j$ ($j = 2, 3, 4$), and such that the
form $f - z^4$ is psd.

Proof. Let $c > 0$ be the minimum value taken by $f$ on the unit sphere
$S^2$ in $\mathbb{R}^3$. Scaling $f$ with a positive factor we may assume $c = 1$, and
after an orthogonal coordinate change we get $c = 1 = f(0, 0, 1)$. The form
$\tilde{f} := f - (x^2 + y^2 + z^2)^2$ is nonnegative on $\mathbb{R}^3$ and vanishes at $(0, 0, 1)$.
Therefore $\tilde{f}$ does not contain the term $z^4$, in fact $\deg_z(\tilde{f}) \le 2$. This means
that $f$ has the shape (3.1). The last assertion follows from
$f - z^4 = \tilde{f} + (x^2 + y^2 + z^2)^2 - z^4 \ge \tilde{f} \ge 0$. □

Remarks 3.2. 1. The form $f - z^4$ is psd and vanishes in $(0, 0, 1)$, so the
results of Sect. 2 apply to $f - z^4$. In particular, we can explicitly construct
a representation of $f - z^4$ as a sum of three squares.

2. The minimum value of $f$ on the unit sphere can be found by inspecting
the solutions of the equation $\nabla f(x, y, z) = \lambda \cdot (x, y, z)$ with $\lambda \in \mathbb{R}$.

For $f$ as in (3.1) we now study the question when $f$ is a sum of three
squares.

Proposition 3.3 ([4] Prop. 3.1). Let $f = z^4 + f_2 z^2 + f_3 z + f_4$ where
$f_j \in \mathbb{R}[x, y]$ is a form of degree $j$ ($j = 2, 3, 4$). Then $f$ is a sum of three squares
if, and only if, there exist binary forms $\xi, \eta \in \mathbb{R}[x, y]$ with $\deg(\xi) = 2$,
$\deg(\eta) = 3$ and

$\eta^2 + f_3^2 = (f_2 - \xi)(4f_4 - \xi^2)$, (3.2)

such that

$f_2 - \xi \ge 0$, $4f_4 - \xi^2 \ge 0$. (3.3)

Remark 3.4. If one of $f_2 - \xi$ and $4f_4 - \xi^2$ is psd, then so is the other by (3.2),
except possibly in the case where $f_2 - \xi$ resp. $4f_4 - \xi^2$ was zero. The latter
can happen only if $f_3 = 0$ and $\eta = 0$. Note that the psd conditions in (3.3)
mean that the two forms are sums of two squares of linear resp. quadratic
forms.

Proof of 3.3. First assume $f = \sum_{i=1}^3 (u_i z^2 + v_i z + w_i)^2$ where
$u_i, v_i, w_i \in \mathbb{R}[x, y]$ are forms of respective degrees 0, 1, 2 ($1 \le i \le 3$). The vector
$(u_1, u_2, u_3) \in \mathbb{R}^3$ has unit length, so by changing with an orthogonal real
$3 \times 3$ matrix we can get $u_1 = 1$ and $u_2 = u_3 = 0$. This implies $v_1 = 0$,
$v_2^2 + v_3^2 = f_2 - 2w_1$, $2(v_2 w_2 + v_3 w_3) = f_3$ and $w_2^2 + w_3^2 = f_4 - w_1^2$. One checks
that (3.2) and (3.3) are satisfied with

$\xi = 2w_1$, $\eta = 2(v_2 w_3 - v_3 w_2)$.
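
The "one checks" step can be illustrated by expanding the sum of three squares as a polynomial in $z$ and comparing coefficients. In the sketch below the forms $v_i, w_i$ are replaced by their integer values at a fixed point $(x, y)$; the sample values and the helper names `poly_mul`, `poly_add` and `sum_of_three_squares` are ours.

```python
def poly_mul(p, q):
    """Multiply polynomials in z given as coefficient lists [c0, c1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add two coefficient lists of possibly different lengths."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def sum_of_three_squares(v2, v3, w1, w2, w3):
    """Coefficients in z of (z^2 + w1)^2 + (v2 z + w2)^2 + (v3 z + w3)^2."""
    p1, p2, p3 = [w1, 0, 1], [w2, v2], [w3, v3]
    total = poly_mul(p1, p1)
    for p in (p2, p3):
        total = poly_add(total, poly_mul(p, p))
    return total  # ordered as [f4, f3, f2, coefficient of z^3, coefficient of z^4]

v2, v3, w1, w2, w3 = 2, -1, 3, 5, 4   # sample values at a fixed (x, y)
f4, f3, f2, c3, c4 = sum_of_three_squares(v2, v3, w1, w2, w3)
xi, eta = 2 * w1, 2 * (v2 * w3 - v3 * w2)
print((c4, c3), eta**2 + f3**2 == (f2 - xi) * (4 * f4 - xi**2))
```

With these sample values the expansion has $z^4$-coefficient 1 and no $z^3$-term, $f_2 - \xi = 5 = v_2^2 + v_3^2$, and both sides of (3.2) equal 820.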

Conversely assume that $\xi, \eta$ satisfy (3.2) and (3.3). If $\xi = f_2$, then $f_3 = 0$,
and by (3.3) there are quadratic forms $w_2, w_3 \in \mathbb{R}[x, y]$ with
$f_4 - \frac{1}{4} f_2^2 = w_2^2 + w_3^2$, so

$f = z^4 + f_2 z^2 + f_4 = \left(z^2 + \frac{f_2}{2}\right)^2 + w_2^2 + w_3^2$.

Now assume $\xi \ne f_2$. By (3.3) there are linear forms $v_2, v_3 \in \mathbb{R}[x, y]$ with
$f_2 - \xi = v_2^2 + v_3^2 = (v_2 + i v_3)(v_2 - i v_3)$ (where $i^2 = -1$). From (3.2) we see
that the linear form $v_2 + i v_3$ divides one of the two forms $\eta \pm i f_3$ (in $\mathbb{C}[x, y]$).
Replacing $v_3$ with $-v_3$ if necessary we can assume $(v_2 + i v_3) \mid (\eta + i f_3)$. This
implies that $f_2 - \xi$ divides

$(\eta + i f_3)(v_2 - i v_3) = (f_3 v_3 + \eta v_2) + i (f_3 v_2 - \eta v_3)$.

Therefore,

$\left(z^2 + \frac{\xi}{2}\right)^2 + \left(v_2 z + \frac{f_3 v_2 - \eta v_3}{2(f_2 - \xi)}\right)^2 + \left(v_3 z + \frac{\eta v_2 + f_3 v_3}{2(f_2 - \xi)}\right)^2$

is a sum of three squares in $\mathbb{R}[x, y, z]$. A comparison of the coefficients shows
that this sum is equal to $f$. □

Remark 3.5. Consider $f = z^4 + f_2 z^2 + f_3 z + f_4$ as a monic polynomial in $z$,
with coefficients $f_j \in \mathbb{R}[x, y]$ as in Prop. 3.3. Equation (3.2) says $\eta^2 = r_f(\xi)$
where

$r_f(z) = (f_2 - z)(4f_4 - z^2) - f_3^2$

is the cubic resolvent of $f$ with respect to $z$ (see 5.2 below).

The following lemma follows from the fact that the sum of squares map
$(p_1, p_2, p_3) \mapsto \sum_j p_j^2$ is topologically proper (see (1) of the introduction).
Avoiding this argument we give a direct proof based on Prop. 3.3.

Lemma 3.6. Let $f^{(1)}, f^{(2)}, \dots$ be a sequence of quartic forms as in 3.3
which converges coefficient-wise to a form $f$. If every $f^{(j)}$ is a sum of three
squares, then the same is true for $f$.

Proof. For every index $j$ there exist forms $\xi^{(j)}, \eta^{(j)} \in \mathbb{R}[x, y]$ satisfying the
conditions of Prop. 3.3. From the inequality $(\xi^{(j)})^2 \le 4f_4^{(j)}$ it follows that
the sequence $\xi^{(j)}$ is bounded, and so the sequence $\eta^{(j)}$ is bounded as well.
Hence there exists a limit point $(\xi, \eta)$ of the sequence $(\xi^{(j)}, \eta^{(j)})$, and $(\xi, \eta)$
satisfies the conditions of 3.3 for the form $f$. □

The rest of this section is not needed for our proof of Hilbert's theorem.
As in the case where $f$ has a real zero (Section 2), we try to find all
representations of $f$ as a sum of three squares.

Lemma 3.7. Let $f$ be as in Prop. 3.3. The construction in the proof of
Prop. 3.3 associates with every representation

$f = p_1^2 + p_2^2 + p_3^2$ (3.4)

a pair $(\xi, \eta)$ which solves (3.2) and (3.3). The form $\xi$ is independent of
the choices. In fact it depends only on the orthogonal equivalence class of
the representation (3.4).

Proof. Consider the representation (3.4), and write

$p_i = u_i z^2 + v_i z + w_i$ ($i = 1, 2, 3$)

where $u_i, v_i, w_i \in \mathbb{R}[x, y]$ are homogeneous of respective degrees 0, 1 and 2.
Writing $u = (u_1, u_2, u_3)^t$, $v = (v_1, v_2, v_3)^t$, $w = (w_1, w_2, w_3)^t$, we choose
$S \in O_3(\mathbb{R})$ with $Su = (1, 0, 0)^t$ as in the proof of Prop. 3.3. If $Sv = (v_1', v_2', v_3')^t$
and $Sw = (w_1', w_2', w_3')^t$, we have shown that

$\xi = 2w_1'$, $\eta = 2(v_2' w_3' - v_3' w_2')$

solve (3.2) and (3.3). If $T$ is another orthogonal matrix with $Tu = (1, 0, 0)^t$,
then $T = US$ where $U$ is orthogonal with first column and row $(1, 0, 0)$. This
shows that using $T$ instead of $S$ does not change $\xi$. The same argument
shows that $\xi$ depends only on the orthogonal equivalence class of (3.4). □

Proposition 3.8. Let $f = z^4 + f_2 z^2 + f_3 z + f_4$ with $f_j \in \mathbb{R}[x, y]$ homogeneous
of degree $j$ ($j = 2, 3, 4$), and assume $\gcd(f_3, 4f_4 - f_2^2) = 1$. Let

$f = \sum_{i=1}^3 p_i^2 = \sum_{i=1}^3 p_i'^2$

be two representations of $f$ with associated invariants $\xi$ and $\xi'$ (see Lemma
3.7). If $\xi = \xi'$, the two representations are orthogonally equivalent.

Proof. Assuming $f_2 - \xi \ne 0$, we first show that $f_2 - \xi$ does not divide $4f_4 - \xi^2$.
From

$(f_2 - \xi)(4f_4 - \xi^2) = \eta^2 + f_3^2 = (\eta + i f_3)(\eta - i f_3)$

we see that $(f_2 - \xi) \mid (4f_4 - \xi^2)$ would imply $(f_2 - \xi) \mid f_3$. On the other hand,
it would imply $(f_2 - \xi) \mid (4f_4 - f_2^2)$, thus contradicting the assumption.

Write $p_i = u_i z^2 + v_i z + w_i$ and $p_i' = u_i' z^2 + v_i' z + w_i'$ ($i = 1, 2, 3$) as in the
proof of Lemma 3.7. We can assume $u_1 = u_1' = 1$ and $u_i = u_i' = 0$ for $i = 2, 3$.
By hypothesis we have $w_1 = w_1' = \frac{\xi}{2}$ and $v_2 w_3 - v_3 w_2 = \pm(v_2' w_3' - v_3' w_2')$;
replacing $p_3$ with $-p_3$ if necessary we can assume

$v_2 w_3 - v_3 w_2 = v_2' w_3' - v_3' w_2' = \frac{\eta}{2}$. (3.5)

Since the coefficient of $z^3$ vanishes in $f$ we have $v_1 = v_1' = 0$, and so
$p_1 = p_1' = z^2 + \frac{\xi}{2}$. Write $v := v_2 + i v_3$, $w := w_2 + i w_3$ and similarly $v' := v_2' + i v_3'$,
$w' := w_2' + i w_3'$. Then (3.5) says $\Im(\bar{v} w) = \Im(\bar{v}' w') = \frac{1}{2}\eta$. A comparison of
the other coefficients gives $v\bar{v} = v'\bar{v}' = f_2 - \xi$, $\Re(\bar{v} w) = \Re(\bar{v}' w') = \frac{1}{2} f_3$ and
$4w\bar{w} = 4w'\bar{w}' = 4f_4 - \xi^2$. In particular, $\bar{v} w = \bar{v}' w' = \frac{1}{2}(f_3 + i\eta)$.
We clearly have $v = 0 \Leftrightarrow v' = 0$, and similarly $w = 0 \Leftrightarrow w' = 0$. In
either of these cases, it is clear that the two representations are equivalent.
Hence we can assume $v, w \ne 0$. Now $v$ does not divide $w'$, because otherwise
$v\bar{v} \mid w'\bar{w}'$, i.e., $(f_2 - \xi) \mid (4f_4 - \xi^2)$, which was ruled out at the beginning.
So we conclude that there exist $\lambda, \mu \in \mathbb{C}$ with $|\lambda| = |\mu| = 1$ and $v' = \lambda v$,
$w' = \mu w$. Then $\bar{v} w = \bar{v}' w'$ implies $\lambda = \mu$. Hence the two representations
are orthogonally equivalent. □

Corollary 3.9. If $\gcd(f_3, 4f_4 - f_2^2) = 1$ the number of inequivalent
representations of $f$ equals the number of forms $\xi$ solving (3.2) and (3.3) with
suitable $\eta$. □

4. Deforming the quartic, I 

4.1. Let $f = f(x, y, z)$ be a nonzero psd quartic form with real coefficients.
We are trying to show that $f$ is a sum of three squares. The case where $f$
has a nontrivial real zero has already been solved completely. From now on
we assume that $f$ is strictly positive definite. We shall use a deformation
to a suitable psd form with real zero to arrive at the desired conclusion, at
least in a generic case.

As in Lemma 3.1 we use scaling by a positive number and an orthogonal
coordinate change to bring $f$ into the form

$f = z^4 + f_2(x, y) z^2 + f_3(x, y) z + f_4(x, y)$ (4.1)

with $\deg(f_j) = j$ ($j = 2, 3, 4$), such that the form

$f - z^4 = f_2(x, y) z^2 + f_3(x, y) z + f_4(x, y)$

is psd. The latter means that each of the binary forms $f_2$, $f_4$ and $4f_2 f_4 - f_3^2$
is psd.

4.2. Let $t$ be a real parameter. Fixing $f$ as in 4.1 we consider the family
of quartic forms

$f^{(t)} := t^2 f + (1 - t^2)(f - z^4)$
$= t^2 z^4 + f_2(x, y) z^2 + f_3(x, y) z + f_4(x, y)$ (4.2)

($t \in \mathbb{R}$). For $0 < |t| \le 1$, the form $f^{(t)}$ is strictly positive definite, while
$f^{(0)} = f - z^4$ has a zero at $(0, 0, 1)$. When $t$ runs from 0 to 1, the form
$f^{(t)}$ covers the line segment between $f - z^4$ and $f$ (inside the space of all
real quartic forms). Note however that the time parameter is quadratic, not
linear.
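
The two expressions for $f^{(t)}$ in (4.2) can be compared pointwise. The following sketch uses hypothetical sample data $f_2 = x^2 + y^2$, $f_3 = x^3$, $f_4 = x^4 + y^4$ (our choice for illustration only; for it $f_2$, $f_4$ and $4f_2 f_4 - f_3^2$ are psd) and exact rational values of $t$.

```python
from fractions import Fraction as F

# Hypothetical sample data, chosen only for illustration.
def f2(x, y): return x * x + y * y
def f3(x, y): return x ** 3
def f4(x, y): return x ** 4 + y ** 4

def f(x, y, z):
    """f = z^4 + f2 z^2 + f3 z + f4, as in (4.1)."""
    return z ** 4 + f2(x, y) * z ** 2 + f3(x, y) * z + f4(x, y)

def f_t_definition(t, x, y, z):
    """f^(t) = t^2 f + (1 - t^2)(f - z^4), the first line of (4.2)."""
    return t * t * f(x, y, z) + (1 - t * t) * (f(x, y, z) - z ** 4)

def f_t_expanded(t, x, y, z):
    """t^2 z^4 + f2 z^2 + f3 z + f4, the second line of (4.2)."""
    return t * t * z ** 4 + f2(x, y) * z ** 2 + f3(x, y) * z + f4(x, y)

points = [(1, 2, 3), (0, 0, 1), (-1, 1, 2)]
params = [F(0), F(1, 2), F(1)]
print(all(f_t_definition(t, *pt) == f_t_expanded(t, *pt)
          for t in params for pt in points))
print(f_t_expanded(F(0), 0, 0, 1))   # value of f^(0) at (0, 0, 1)
```

The last line evaluates $f^{(0)}$ at $(0, 0, 1)$, where only the $z^4$-term of $f$ could contribute, so the value is zero as stated above.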

4.3. Let $0 \ne t \in \mathbb{R}$. By Prop. 3.3, $f^{(t)}$ is a sum of three squares if and only
if there are forms $\tilde\xi, \tilde\eta \in \mathbb{R}[x, y]$ with
$\tilde\eta^2 + t^{-4} f_3^2 = (t^{-2} f_2 - \tilde\xi)(4t^{-2} f_4 - \tilde\xi^2)$,
with both factors on the right psd. Multiplying with $t^4$ and substituting
$\xi = t\tilde\xi$, $\eta = t^2\tilde\eta$, we see that this happens if and only if there are forms $\xi$, $\eta$
in $\mathbb{R}[x, y]$ (of degrees 2 resp. 3) such that

\[ \eta^2 + f_3^2 = (f_2 - t\xi)(4f_4 - \xi^2), \tag{4.3} \]
\[ f_2 - t\xi \ge 0, \qquad 4f_4 - \xi^2 \ge 0. \tag{4.4} \]
On the other hand, conditions (4.3), (4.4) have a solution $(\xi_0, \eta_0)$ for $t = 0$,
provided that $f_2$ is not a square, since $4f_2f_4 - f_3^2$ is then represented by
$\langle 1, f_2 \rangle$ (Cor. 1.2; see also the treatment of forms with a real zero above).
The condition $4f_4 - \xi^2 \ge 0$ is automatic since $f_2 \ge 0$ and $f_2 \ne 0$. Keeping the
assumption that $f_2$ is not a square, let us fix such forms $\xi_0$, $\eta_0$ with
$\deg(\xi_0) = 2$, $\deg(\eta_0) = 3$ and

\[ \eta_0^2 + f_2\xi_0^2 = 4f_2f_4 - f_3^2. \tag{4.5} \]

Proposition 4.4. In addition to the assumptions in 4.1 assume that $f_2$
is not a square, that $f_2 \nmid f_3$ and that $4f_2f_4 - f_3^2$ is square-free. Then there
exist continuous families $(\xi^{(t)})$, $(\eta^{(t)})$ ($|t| < \varepsilon$, for some $\varepsilon > 0$) of forms such
that $(\xi^{(0)}, \eta^{(0)}) = (\xi_0, \eta_0)$, and such that $(\xi^{(t)}, \eta^{(t)})$ solves (4.3), (4.4) for all
$|t| < \varepsilon$.

For the proof we need the following simple lemma: 

Lemma 4.5. Let $k$ be a field, let $f, g \in k[t]$ be polynomials with $\deg(f) = m$,
$\deg(g) = n$ and $m, n \ge 1$. The linear map

\[ k[t]_{m-1} \oplus k[t]_{n-1} \to k[t]_{m+n-1}, \qquad (p, q) \mapsto pg + qf \]

is bijective if and only if $f$ and $g$ are relatively prime. (Here $k[t]_d$ denotes
the space of polynomials of degree $\le d$.)

Proof. Both the source and the target vector space have the same dimension
$m + n$. If $f$ and $g$ are relatively prime, then $pg + qf = 0$ implies $f \mid p$ and
$g \mid q$, whence $p = q = 0$ by degree reasons. The reverse implication is
obvious. □

Note that if one uses the canonical linear bases to describe the map in
4.5 by a matrix, and takes its determinant, one obtains the resultant of $f$
and $g$.
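The matrix description of the map in Lemma 4.5 can be tried out on small examples. The following is a minimal sketch, assuming polynomials are stored as coefficient lists (lowest degree first); the helper names `poly_mul`, `lemma_matrix` and `det` and the sample polynomials are illustrative choices, not taken from the text. The determinant is nonzero for a coprime pair and vanishes when $f$ and $g$ share a root.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def lemma_matrix(f, g):
    """Matrix of (p, q) -> p*g + q*f in the monomial bases.

    f has degree m and g has degree n; p ranges over degree <= m-1,
    q over degree <= n-1, and the target is degree <= m+n-1.
    """
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    cols = []
    for i in range(m):                      # image of p = t^i, q = 0
        col = poly_mul([0] * i + [1], g)
        cols.append(col + [0] * (size - len(col)))
    for j in range(n):                      # image of p = 0, q = t^j
        col = poly_mul([0] * j + [1], f)
        cols.append(col + [0] * (size - len(col)))
    # Transpose so that the images of the basis vectors become columns.
    return [[Fraction(cols[c][r]) for c in range(size)] for r in range(size)]

def det(mat):
    """Determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in mat]
    n, d = len(a), Fraction(1)
    for k in range(n):
        piv = next((r for r in range(k, n) if a[r][k] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != k:
            a[k], a[piv] = a[piv], a[k]
            d = -d
        d *= a[k][k]
        for r in range(k + 1, n):
            factor = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
    return d

coprime = det(lemma_matrix([-1, 0, 1], [-2, 1]))   # t^2 - 1 and t - 2
shared = det(lemma_matrix([-1, 0, 1], [-1, 1]))    # t^2 - 1 and t - 1
print(coprime, shared)
```

The sign of the determinant depends on how the basis vectors are ordered, so only its vanishing or non-vanishing should be read off from this sketch.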

Proof of Prop. 4.4. We first exploit the assumptions. The forms $\xi_0$ and $\eta_0$ are
relatively prime since the square of any common divisor divides $4f_2f_4 - f_3^2$
by (4.5). Also, the irreducible form $f_2$ does not divide $\eta_0$, since otherwise
(4.5) would imply $f_2 \mid f_3$. We conclude that $f_2\xi_0$ and $\eta_0$ are relatively prime.

Let $V_d \subset \mathbb{R}[x, y]$ denote the space of binary forms of degree $d$, and consider
the map

\[ F\colon V_2 \times V_3 \times \mathbb{R} \to V_6, \qquad (\xi, \eta, t) \mapsto \eta^2 + f_3^2 - (f_2 - t\xi)(4f_4 - \xi^2). \]

The partial derivative of $F$ at $(\xi_0, \eta_0, 0)$ with respect to $(\xi, \eta)$ is the linear
map

\[ V_2 \oplus V_3 \to V_6, \qquad (\xi, \eta) \mapsto 2(\eta_0\eta + f_2\xi_0\xi). \]

By Lemma 4.5, this map is bijective.

The theorem on implicit functions gives us therefore the existence of
continuous families $(\xi^{(t)})$, $(\eta^{(t)})$, for $|t| < \varepsilon'$ and some $\varepsilon' > 0$, with
$(\xi^{(0)}, \eta^{(0)}) = (\xi_0, \eta_0)$ and with $F(\xi^{(t)}, \eta^{(t)}, t) = 0$ (that is, (4.3)) for $|t| < \varepsilon'$.
As for conditions (4.4), it suffices to verify the first of them since $f_3 \ne 0$. For $t = 0$,
$f_2 - t\xi^{(t)} = f_2$ is strictly positive definite by assumption. Hence there is
some $\varepsilon'' > 0$ such that $f_2 - t\xi^{(t)} > 0$ for all $|t| < \varepsilon''$, and we can take
$\varepsilon = \min\{\varepsilon', \varepsilon''\}$. □

Using Prop. 3.3 we conclude from Prop. 4.4:

Corollary 4.6. Assume that $f = z^4 + f_2z^2 + f_3z + f_4$ (with $f_j \in \mathbb{R}[x, y]$ and
$\deg(f_j) = j$) is strictly positive definite and satisfies $f - z^4 \ge 0$. If $f_2$ is not
a square, $f_2 \nmid f_3$ and $4f_2f_4 - f_3^2$ is square-free, then there exists $\varepsilon > 0$ such
that $f^{(t)}$ is a sum of three squares for all $0 < |t| < \varepsilon$.

Remarks 4.7. 1. It can be shown that a representation of $f^{(t)}$ as a sum of
three squares can be chosen for every $|t| < \varepsilon$ such that the polynomials
in this representation depend continuously on $t$, starting at $t = 0$ with an
arbitrary representation of $f^{(0)} = f - z^4$.

2. Since the map $F$ in the proof of 4.4 is polynomial, a suitable version
of the implicit function theorem (see [2] 10.2.4, for example) shows that the
families $(\xi^{(t)})$, $(\eta^{(t)})$ are not just continuous but even analytic.

5. The discriminant 

Before proceeding to extend representations of $f^{(t)}$ over the entire interval
$0 \le t \le 1$, we need to discuss the discriminant of $f^{(t)}$.

5.1. Here are some reminders about the classical discriminant. Let $K$ be a
field, let

\[ f = a_0 z^n + a_1 z^{n-1} + \cdots + a_n \in K[z] \]

with $a_0 \ne 0$. The discriminant of $f$ is defined as

\[ \operatorname{disc}(f) = \operatorname{disc}_n(f) = a_0^{2n-2} \prod_{i<j} (\alpha_i - \alpha_j)^2 \]

if $\alpha_1, \dots, \alpha_n$ are the roots of $f$ in an algebraic closure of $K$. More precisely,
this is the $n$-discriminant of $f$; if $\deg(f) = m < n - 1$ then $\operatorname{disc}_n(f) = 0$,
while in general $\operatorname{disc}_m(f) \ne 0$. If $\deg(f) = n$ then it follows directly from
the definition that $\operatorname{disc}_n(f) = 0$ if and only if $f$ has a multiple root.

Using the theorem on symmetric functions one sees that $\operatorname{disc}_n(f)$ is an
integral polynomial in the coefficients $a_0, \dots, a_n$ of $f$. Moreover, there exist
universal polynomials $p, q \in \mathbb{Z}[a_0, \dots, a_n, z]$ such that

\[ \operatorname{disc}_n(f) = pf + qf' \]

where $f'$ is the derivative of $f$. One finds $p$ and $q$ by writing $f$ with
indeterminate coefficients and performing the Euclidean algorithm on $f$ and
$f'$.

Directly from the definition one sees that the polynomial $f(\lambda z)$ has
discriminant

\[ \operatorname{disc}_n f(\lambda z) = \lambda^{n(n-1)} \operatorname{disc}_n f(z), \tag{5.1} \]

if $\lambda \in K$ is a parameter.

In the remainder of this section the degree $n$ is always clear from the
context, and we omit the index $n$ of the discriminant.

5.2. Given a quartic polynomial

\[ f(z) = a_0 z^4 + a_1 z^3 + a_2 z^2 + a_3 z + a_4, \]

the cubic resolvent $r_f(z)$ of $f(z)$ is defined to be the cubic polynomial

\[ r_f(z) = a_0^3 z^3 - a_0^2 a_2 z^2 + a_0(a_1a_3 - 4a_0a_4)z + (4a_0a_2a_4 - a_0a_3^2 - a_1^2a_4). \]

If $a_0 \ne 0$ and $\alpha_1, \dots, \alpha_4$ are the roots of $f(z)$ in an algebraic closure of $K$,
a calculation with symmetric polynomials shows that $r_f(z)$ has the roots

\[ \beta_1 = \alpha_1\alpha_2 + \alpha_3\alpha_4, \quad \beta_2 = \alpha_1\alpha_3 + \alpha_2\alpha_4, \quad \beta_3 = \alpha_1\alpha_4 + \alpha_2\alpha_3. \]

We will only use the case where the cubic coefficient $a_1$ of $f$ vanishes, in
which

\[ \operatorname{disc}(f) = a_0\bigl(-4a_2^3a_3^2 - 27a_0a_3^4 + 16a_2^4a_4 - 128a_0a_2^2a_4^2 + 144a_0a_2a_3^2a_4 + 256a_0^2a_4^3\bigr) \]

and

\[ r_f(z) = a_0\bigl((a_0 z - a_2)(a_0 z^2 - 4a_4) - a_3^2\bigr). \]

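Lemma 5.3. Let $f = a_0z^4 + a_1z^3 + a_2z^2 + a_3z + a_4$. Then

\[ \operatorname{disc} r_f(z) = a_0^6 \operatorname{disc} f(z). \]

Proof. If $\beta_1, \beta_2, \beta_3$ are the roots of $r_f$ as in 5.2 then

\[ \beta_1 - \beta_2 = (\alpha_1 - \alpha_4)(\alpha_2 - \alpha_3), \]
\[ \beta_1 - \beta_3 = (\alpha_1 - \alpha_3)(\alpha_2 - \alpha_4), \]
\[ \beta_2 - \beta_3 = (\alpha_1 - \alpha_2)(\alpha_3 - \alpha_4), \]

from which one immediately sees

\[ \operatorname{disc}(r_f) = a_0^{12} \prod_{1 \le k < l \le 3} (\beta_k - \beta_l)^2 = a_0^{12} \prod_{1 \le i < j \le 4} (\alpha_i - \alpha_j)^2 = a_0^6 \operatorname{disc}(f). \]

□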
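The statements of 5.2 and Lemma 5.3 can be checked on a concrete quartic. The following is a minimal sketch in exact rational arithmetic; the sample roots $1, 2, 3, -6$ (which sum to zero, so that $a_1 = 0$) and the leading coefficient $a_0 = 2$ are hypothetical values chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical sample quartic a0*(z-1)(z-2)(z-3)(z+6); the roots sum to 0, so a1 = 0.
a0 = Fraction(2)
alpha = [Fraction(r) for r in (1, 2, 3, -6)]

def elementary(roots, k):
    """k-th elementary symmetric polynomial of the given roots."""
    total = Fraction(0)
    for combo in combinations(roots, k):
        prod = Fraction(1)
        for r in combo:
            prod *= r
        total += prod
    return total

e1, e2, e3, e4 = (elementary(alpha, k) for k in (1, 2, 3, 4))
a1, a2, a3, a4 = -a0 * e1, a0 * e2, -a0 * e3, a0 * e4

def resolvent(z):
    """Cubic resolvent r_f(z) in the case a1 = 0, as in 5.2."""
    return a0 * ((a0 * z - a2) * (a0 * z * z - 4 * a4) - a3 ** 2)

beta = [alpha[0] * alpha[1] + alpha[2] * alpha[3],
        alpha[0] * alpha[2] + alpha[1] * alpha[3],
        alpha[0] * alpha[3] + alpha[1] * alpha[2]]

def sq_diff_product(values):
    """Product of squared pairwise differences."""
    prod = Fraction(1)
    for u, v in combinations(values, 2):
        prod *= (u - v) ** 2
    return prod

disc_f_roots = a0 ** 6 * sq_diff_product(alpha)          # definition in 5.1, n = 4
disc_f_formula = a0 * (-4 * a2**3 * a3**2 - 27 * a0 * a3**4 + 16 * a2**4 * a4
                       - 128 * a0 * a2**2 * a4**2 + 144 * a0 * a2 * a3**2 * a4
                       + 256 * a0**2 * a4**3)
disc_r = a0 ** 12 * sq_diff_product(beta)                # definition in 5.1, n = 3

print(all(resolvent(b) == 0 for b in beta),
      disc_f_roots == disc_f_formula,
      disc_r == a0 ** 6 * disc_f_roots)
```

The three printed booleans correspond, in order, to the roots of $r_f$, the explicit discriminant formula for $a_1 = 0$, and the identity of Lemma 5.3.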

Remark 5.4. If $K = \mathbb{R}$, and if a quartic polynomial $f(z) = a_0z^4 + a_2z^2 +
a_3z + a_4 \in \mathbb{R}[z]$ with $a_0 \ne 0$ is known to be strictly positive definite, then
$f$ can have a multiple root only if $f$ is a square. Therefore, $\operatorname{disc}(f) = 0$ is
equivalent to $a_3 = a_2^2 - 4a_0a_4 = 0$ in this case.

5.5. Now let $t \ne 0$ be a real parameter. We consider

\[ f^{(t)} = t^2z^4 + f_2(x,y)\,z^2 + f_3(x,y)\,z + f_4(x,y) \]

as a quartic polynomial in the variable $z$ over $\mathbb{R}[x, y]$ (see (4.2)). Let $r_t(z)$ be
the cubic resolvent of $f^{(t)}$ (with respect to $z$). We put

\[ g_t(z) := \frac{1}{t^2}\, r_t\!\left(\frac{z}{t}\right) = (tz - f_2)(z^2 - 4f_4) - f_3^2, \]

and we define

\[ D_t := \operatorname{disc} g_t(z) \in \mathbb{R}[x, y]. \]

Using (5.1) and Lemma 5.3 we find

\[ D_t = \operatorname{disc} g_t(z) = t^{-2} \operatorname{disc} f^{(t)}(z). \]

Explicitly, this gives

\[ D_t = -4f_2^3f_3^2 - 27t^2f_3^4 + 16f_2^4f_4 - 128t^2f_2^2f_4^2 + 144t^2f_2f_3^2f_4 + 256t^4f_4^3 \]
\[ \phantom{D_t} = 16f_4(4t^2f_4 - f_2^2)^2 + 4f_2f_3^2(36t^2f_4 - f_2^2) - 27t^2f_3^4 \]

(a form of degree 12 in $x$ and $y$). We further put

\[ h_t(z) := \frac{d}{dz}\, g_t(z) = 3tz^2 - 2f_2z - 4tf_4 \tag{5.2} \]

and conclude:

Lemma 5.6. $D_t$ lies in the ideal generated by $g_t$ and $h_t$ in $\mathbb{R}[x, y, z]$. □
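The factor $t^{-2}$ relating $D_t$ to the discriminant of $f^{(t)}$ can be traced step by step, with $g_t(z) = t^{-2} r_t(z/t)$ as in 5.5. The following worked computation is a sketch; besides (5.1) and Lemma 5.3 it uses the scaling rule $\operatorname{disc}_n(cF) = c^{2n-2}\operatorname{disc}_n(F)$, which follows from the definition in 5.1 but is not stated explicitly in the text.

```latex
\begin{align*}
D_t = \operatorname{disc}_3\bigl(t^{-2}\, r_t(z/t)\bigr)
  &= (t^{-2})^{4}\,\operatorname{disc}_3\bigl(r_t(z/t)\bigr)
     && \text{(scaling rule with } n = 3\text{)} \\
  &= t^{-8}\cdot t^{-6}\,\operatorname{disc}_3\bigl(r_t(z)\bigr)
     && \text{((5.1) with } \lambda = 1/t\text{)} \\
  &= t^{-14}\cdot (t^{2})^{6}\,\operatorname{disc}_4 f^{(t)}(z)
     && \text{(Lemma 5.3 with } a_0 = t^2\text{)} \\
  &= t^{-2}\,\operatorname{disc}_4 f^{(t)}(z).
\end{align*}
```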

6. Deforming the quartic, II: Case of a linear factor 

6.1. For $t \in \mathbb{R}$ we continue to consider the form

\[ f^{(t)} = t^2z^4 + f_2(x,y)\,z^2 + f_3(x,y)\,z + f_4(x,y), \]

see (4.2). We know that $f^{(t)}$ is strictly positive definite for $0 < |t| \le 1$, and
that $f^{(t)}$ is a sum of three squares for small $|t|$ (Prop. 4.4).

Let $t_0 \ne 0$ be a fixed real number, and assume that the form $f^{(t_0)}$ is strictly
positive definite and a sum of three squares. Under generic assumptions on
$f$ which do not depend on $t_0$, we shall show that $f^{(t)}$ is a sum of three squares
for all $t$ sufficiently close to $t_0$.

6.2. That $f^{(t_0)}$ is a sum of three squares means the following, by 4.3: There
exist forms $\xi_0, \eta_0 \in \mathbb{R}[x, y]$ with $\deg(\xi_0) = 2$, $\deg(\eta_0) = 3$ such that

\[ \eta_0^2 = (f_2 - t_0\xi_0)(4f_4 - \xi_0^2) - f_3^2 = g_{t_0}(\xi_0), \tag{6.1} \]

and

\[ f_2 - t_0\xi_0 \ge 0, \qquad 4f_4 - \xi_0^2 \ge 0. \tag{6.2} \]

For $d \ge 0$ we let again $V_d$ denote the vector space of forms of degree $d$ in
$\mathbb{R}[x, y]$. As in the proof of Prop. 4.4 we consider the map $F\colon V_2 \times V_3 \times \mathbb{R} \to V_6$,

\[ F(\xi, \eta, t) = \eta^2 + f_3^2 - (f_2 - t\xi)(4f_4 - \xi^2) = \eta^2 - g_t(\xi) \]

(see 5.5 for $g_t(\xi)$). The partial derivative of $F$ in $(\xi_0, \eta_0, t_0)$ with respect to
$(\xi, \eta)$ is the linear map

\[ V_2 \oplus V_3 \to V_6, \qquad (\xi, \eta) \mapsto 2\eta_0 \cdot \eta - h_{t_0}(\xi_0) \cdot \xi \]

where

\[ h_{t_0}(\xi_0) = 3t_0\xi_0^2 - 2f_2\xi_0 - 4t_0f_4, \]

cf. (5.2).

Proposition 6.3. Assume that the two forms $\eta_0 \in V_3$ and $h_{t_0}(\xi_0) \in V_4$
are relatively prime, and that $f_3 \ne 0$. Then there exist $\varepsilon > 0$ and solutions
$(\xi_t, \eta_t)$ to (4.3) and (4.4) for $|t - t_0| < \varepsilon$ such that $(\xi_{t_0}, \eta_{t_0}) = (\xi_0, \eta_0)$.

Proof. Indeed, by applying Lemma 4.5 as in the proof of Prop. 4.4, it follows
from the theorem on implicit functions that there are $(\xi_t, \eta_t)$ depending
continuously (in fact analytically, see 4.7) on $t$ and satisfying $(\xi_{t_0}, \eta_{t_0}) =
(\xi_0, \eta_0)$ and $F(\xi_t, \eta_t, t) = 0$, for $|t - t_0| < \varepsilon'$ (with suitable $\varepsilon' > 0$). So
equations (4.3) hold for $|t - t_0| < \varepsilon'$. We claim that conditions (4.4) hold as
well for suitable $0 < \varepsilon \le \varepsilon'$. Indeed, since $f_3 \ne 0$, this is clear if $f_2 - t_0\xi_0$ is
strictly positive, see Remark 3.4. If $f_2 - t_0\xi_0$ is a square, it could a priori
happen that the quadratic form $f_2 - t\xi_t$ is indefinite for $t$ arbitrarily close to
$t_0$, say with real zeros $\alpha_t < \beta_t$. However, since $(f_2 - t\xi_t)(4f_4 - \xi_t^2) = f_3^2 + \eta_t^2$,
this would imply that $\alpha_t$ and $\beta_t$ are roots of $f_3$ for all these $t$, which is
evidently impossible. □

6.4. It remains to show, under suitable generic assumptions on $f$, that the
following is true:

For every real number $t \ne 0$ such that $f^{(t)}$ is positive definite,
and for every solution $(\xi, \eta)$ of (4.3) and (4.4), the two forms
$\eta$ and $h_t(\xi)$ are relatively prime.

To analyze the problem, assume that $\eta$ and $h_t(\xi)$ have a nontrivial common
divisor $p = p(x, y)$ in $\mathbb{R}[x, y]$. We can assume that $p$ is irreducible, hence
(homogeneous) of degree one or two. By (4.3), $\eta^2 = g_t(\xi)$, and so $p$ divides
$g_t(\xi)$ as well.

Below we will treat the case where $p$ is linear. The quadratic case will be
dealt with in Sect. 8.

6.5. So assume that $t \ne 0$ and $f^{(t)}$ is positive definite, and $p$ is a linear
common divisor of $g_t(\xi)$ and $h_t(\xi)$ in $\mathbb{R}[x, y]$. Let us denote equivalence in
$\mathbb{R}[x, y]$ modulo the principal ideal $(p)$ by $\equiv$. By Lemma 5.6, $D_t$ lies in the
ideal generated by $g_t(\xi)$ and $h_t(\xi)$. We conclude that $D_t \equiv 0$.

Since $f^{(t)}$ is strictly positive definite, and since $\operatorname{disc} f^{(t)} = t^2 \cdot D_t$, Remark
5.4 implies $f_3 \equiv 4t^2f_4 - f_2^2 \equiv 0$. Since $p^2$ divides

\[ g_t(\xi) = (f_2 - t\xi)(4f_4 - \xi^2) - f_3^2, \]

and since both factors $f_2 - t\xi$ and $4f_4 - \xi^2$ are psd, we conclude that $p^2$
divides $f_2 - t\xi$ or $4f_4 - \xi^2$. From $t^2(4f_4 - \xi^2) = (f_2^2 - t^2\xi^2) + (4t^2f_4 - f_2^2)$
we see that in fact $p^2$ divides $4f_4 - \xi^2$ unconditionally, and that $p$ divides
$f_2^2 - t^2\xi^2$. So we have

\[ \eta \equiv f_3 \equiv f_2^2 - t^2\xi^2 \equiv 0, \qquad p^2 \mid (4f_4 - \xi^2). \tag{6.3} \]

From $f_2^2 - t^2\xi^2 \equiv 0$ we see that one of the two conditions $f_2 \pm t\xi \equiv 0$ holds.
When $f_2 - t\xi \equiv 0$, this implies $p^2 \mid (f_2 - t\xi)$ since $f_2 - t\xi$ is psd, and so the
right hand side of

\[ \eta^2 + f_3^2 = (f_2 - t\xi)(4f_4 - \xi^2) \]

is divisible by $p^4$. This implies $p^2 \mid f_3$, and so $f_3$ is not square-free, which is
a non-generic situation. When $f_2 + t\xi \equiv 0$, we combine this with $4f_4 \equiv \xi^2$
to get

\[ 0 \equiv h_t(\xi) = 3t\xi^2 - 2f_2\xi - 4tf_4 \equiv (3 + 2 - 1)\,t\xi^2 = 4t\xi^2. \]

This gives $\xi \equiv 0$, and hence $f_4 \equiv 0$, whence $\gcd(f_3, f_4) \ne 1$. Again this is a
non-generic situation.

7. Quadratic common divisors in pencils of polynomials 

Proposition 7.1. Fix $m, n \ge 2$ and consider triples $(f, g, h)$ of univariate
polynomials with $\deg(f) \le m$ and $\deg(g), \deg(h) \le n$. There exists a
nonzero integral polynomial $\Psi_{m,n}(f, g, h)$ in the coefficients of $f$, $g$, $h$ with
the following property:

For any field $k$ and any polynomials $f, g, h \in k[x]$ with $\deg(f) \le m$ and
$\deg(g), \deg(h) \le n$, if there exists $(0,0) \ne (s,t) \in k^2$ with

\[ \deg\gcd(f, sg + th) \ge 2, \]

then $\Psi_{m,n}(f, g, h) = 0$.

Proof. Let $k$ be algebraically closed and $f \in k[x]$. Assume $\deg(f) = m$, let
$\alpha_1, \dots, \alpha_m$ be the roots of $f$, and assume that the $\alpha_i$ are pairwise distinct,
i.e., that $f$ is separable. Given $g$ and $h$, there exists $(s,t) \ne (0,0)$ with
$\deg\gcd(f, sg + th) \ge 2$ if and only if there exist $1 \le i < j \le m$ such that

\[ sg(\alpha_i) + th(\alpha_i) = sg(\alpha_j) + th(\alpha_j) = 0 \]

for some $(s, t) \ne (0,0)$, or equivalently, such that

\[ g(\alpha_i)h(\alpha_j) = g(\alpha_j)h(\alpha_i). \]

So this holds if and only if

\[ \varphi(f, g, h) := \prod_{1 \le i < j \le m} \frac{g(\alpha_i)h(\alpha_j) - g(\alpha_j)h(\alpha_i)}{\alpha_i - \alpha_j} \]

vanishes. It is easy to see that $\varphi$ is invariant under all permutations of
the roots $\alpha_i$. Hence when $f$ is monic, $\varphi$ is an integral polynomial in the
coefficients of $f$, $g$ and $h$. To cover the non-monic case as well, observe that
$\varphi$ has degree $\le (m - 1)(n - 1)$ with respect to each $\alpha_i$. Therefore, if $a_0$
denotes the leading coefficient of $f$, it follows that

\[ \Phi(f, g, h) := a_0^{(m-1)(n-1)} \cdot \prod_{1 \le i < j \le m} \frac{g(\alpha_i)h(\alpha_j) - g(\alpha_j)h(\alpha_i)}{\alpha_i - \alpha_j} \]

is an integral polynomial in the coefficients of $f$, $g$ and $h$.

From $\Phi(f, x, 1) = 1$ for monic $f$ of degree $m$ we see that $\Phi$ does not vanish
identically. To prove the proposition it suffices to put

\[ \Psi_{m,n}(f, g, h) := \operatorname{disc}_m(f) \cdot \Phi(f, g, h). \]

□

Definition 7.2. For polynomials $f, g, h \in k[x]$ with $\deg(f) \le m$ and $\deg(g)$,
$\deg(h) \le n$, we define the $\Phi$-invariant by

\[ \Phi_{m,n}(f, g, h) := a_0^{(m-1)(n-1)} \cdot \prod_{1 \le i < j \le m} \frac{g(\alpha_i)h(\alpha_j) - g(\alpha_j)h(\alpha_i)}{\alpha_i - \alpha_j} \]

where $\alpha_1, \dots, \alpha_m$ are the roots of $f$ and $a_0$ is the coefficient of $x^m$ in $f$. By
the proof of Proposition 7.1, $\Phi_{m,n}(f, g, h)$ is an integral polynomial in the
coefficients of $f$, $g$ and $h$.

The proof of Proposition 7.1 has shown:

Corollary 7.3. In 7.1 we can take

\[ \Psi_{m,n}(f, g, h) = \operatorname{disc}_m(f) \cdot \Phi_{m,n}(f, g, h). \]

If $f$ is separable with $\deg(f) \ge m - 1$, then $\Phi_{m,n}(f, g, h) = 0$ is equivalent to
the existence of a pair $(0, 0) \ne (s, t) \in k^2$ with $sg + th = 0$ or
$\deg\gcd(f, sg + th) \ge 2$. □

Remarks 7.4. 1. The power of $a_0$ in the definition of $\Phi_{m,n}$ is the correct one,
in the sense that $\Phi_{m,n}$ is not divisible by $a_0$. Indeed, if $f = \sum_{i=0}^m a_i x^{m-i}$,
and if one takes $g := x^{n-1}(b_0x + b_1)$, $h := x^{n-1}(c_0x + c_1)$, one finds

\[ \Phi_{m,n}(f, g, h) = a_m^{(m-1)(n-1)} \cdot (b_0c_1 - b_1c_0)^{\binom{m}{2}}. \]

2. Write $f = \sum_{i=0}^m a_i x^{m-i}$, $g = \sum_{j=0}^n b_j x^{n-j}$ and $h = \sum_{j=0}^n c_j x^{n-j}$. As a
polynomial in the $a_i$, $b_j$ and $c_j$, $\Phi_{m,n}$ is homogeneous of degree $(m-1)(n-1)$
in the $a_i$ and of degree $\binom{m}{2}$ in the $b_j$ and in the $c_j$. If we give degree $i$ to $a_i$
and degree $j$ to $b_j$ and $c_j$, then $\Phi_{m,n}$ is jointly homogeneous in all variables
of degree $\binom{m}{2}(2n-1)$.

3. The $\Phi$-invariant has some relations with resultants. For example, the
rule

\[ \Phi_{m,n+d}(f, pg, ph) = \operatorname{res}_{m,d}(f, p)^{m-1} \cdot \Phi_{m,n}(f, g, h) \]

holds, for $\deg(f) \le m$, $\deg(p) \le d$ and $\deg(g), \deg(h) \le n$.

Example 7.5. Let $a_i$, $b_j$, $c_j$ be the coefficients of $f$, $g$, $h$ as before. In low
degrees it is quite manageable to calculate $\Phi$ explicitly. For example we have

\[ \Phi_{2,2}(f, g, h) = \det(f, g, h) = \begin{vmatrix} a_0 & b_0 & c_0 \\ a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{vmatrix} \]

or

\begin{align*}
\Phi_{2,3}(f, g, h) ={}& a_0^2b_2c_3 - a_0a_2b_2c_1 + a_1a_2b_2c_0 - a_0a_1b_1c_3 + a_1^2b_0c_3 \\
& - a_0a_2b_0c_3 + a_2^2b_0c_1 - a_0^2b_3c_2 + a_0a_2b_1c_2 - a_1a_2b_0c_2 \\
& + a_0a_1b_3c_1 - a_1^2b_3c_0 + a_0a_2b_3c_0 - a_2^2b_1c_0.
\end{align*}

As the remarks on the degree of $\Phi_{m,n}$ show, the size of $\Phi_{m,n}$ grows quickly
with $m$ and $n$.
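For small cases the root formula of Definition 7.2 can be evaluated directly and compared with the closed forms above. The following is a minimal sketch in exact arithmetic, assuming $f$ is given by its leading coefficient and pairwise distinct rational roots; the function names and the sample polynomials are illustrative choices, not taken from the text. It compares $\Phi_{2,2}$ with the $3 \times 3$ determinant of Example 7.5 and checks the formula of Remark 7.4.1 for $m = 3$, $n = 2$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def evaluate(coeffs, x):
    """Horner evaluation; coeffs are listed from the leading coefficient down."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value

def phi(a0, roots, g, h, n):
    """Phi_{m,n} for f = a0 * prod(x - root), computed from the root formula.

    Assumes the roots are pairwise distinct; g and h are coefficient lists
    of length n + 1 (leading coefficient first).
    """
    m = len(roots)
    assert len(g) == n + 1 and len(h) == n + 1
    value = Fraction(a0) ** ((m - 1) * (n - 1))
    for ai, aj in combinations(roots, 2):
        num = evaluate(g, ai) * evaluate(h, aj) - evaluate(g, aj) * evaluate(h, ai)
        value *= num / (ai - aj)
    return value

def det3(cols):
    """Determinant of the 3x3 matrix whose columns are the given triples."""
    (a, d, g_), (b, e, h_), (c, f_, i) = cols
    return a * (e * i - f_ * h_) - b * (d * i - f_ * g_) + c * (d * h_ - e * g_)

# m = n = 2: f = 2(x - 1)(x - 3) = 2x^2 - 8x + 6, with sample g and h.
f_coeffs, g2, h2 = [2, -8, 6], [1, 2, 5], [3, -1, 4]
phi_22 = phi(2, [1, 3], g2, h2, 2)
det_22 = det3([f_coeffs, g2, h2])

# Remark 7.4.1 with m = 3, n = 2: f = 2(x-1)(x-2)(x-3), so a_m = a_3 = -12.
b0, b1, c0, c1 = 1, 2, 3, -1
phi_32 = phi(2, [1, 2, 3], [b0, b1, 0], [c0, c1, 0], 2)
closed_32 = (-12) ** ((3 - 1) * (2 - 1)) * (b0 * c1 - b1 * c0) ** comb(3, 2)

print(phi_22, det_22, phi_32, closed_32)
```

Because the root formula needs the roots of $f$, this approach is only practical for hand-picked examples; for general coefficients one would expand $\Phi_{m,n}$ symbolically instead.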

We do not know whether $\Phi_{m,n}(f, g, h)$ or some related invariant has been
considered before.

8. Deforming the quartic, III: Case of a quadratic factor

As before we write

\[ g_t(\xi) = t\xi^3 - f_2\xi^2 - 4tf_4\xi + f_6 \]

where $f_6 := 4f_2f_4 - f_3^2$, and

\[ h_t(\xi) = \frac{d}{d\xi}\, g_t(\xi) = 3t\xi^2 - 2f_2\xi - 4tf_4. \]

The hardest step in our proof is to show, for generically chosen $f_j$, that $g_t(\xi)$
and $h_t(\xi)$ have no common quadratic factor, whenever $(\xi, \eta)$ is a solution of
(4.3) and $t \ne 0$. This will be accomplished by the following result:



Proposition 8.1. Consider triples $(f_2, f_3, f_4)$ of forms in $\mathbb{R}[x, y]$ (with
$\deg(f_i) = i$ for $i = 2, 3, 4$) for which

\[ \eta^2 = (f_2 - t\xi)(4f_4 - \xi^2) - f_3^2 = g_t(\xi) \tag{8.1} \]

has a solution $(\xi, \eta)$ for some $0 \ne t \in \mathbb{R}$ such that $g_t(\xi)$ and $h_t(\xi)$ have
a common irreducible quadratic factor. Then these triples are not Zariski
dense.

In other words, there exists a nonzero polynomial $\Psi = \Psi(f_2, f_3, f_4)$ in the
coefficients of $f_2$, $f_3$ and $f_4$ which vanishes on the triples described in the
proposition.

The plan of the proof is as follows. We will successively deduce six
"exceptional" conditions on $(f_2, f_3, f_4)$, labelled $(S_1)$–$(S_6)$. We will show that, for
generic choice of the $f_j$, none of these conditions holds. On the other hand,
we'll show that the assumptions of 8.1 imply that at least one of $(S_1)$–$(S_6)$
is satisfied.

8.2. We dehomogenize all forms in $\mathbb{R}[x, y]$ by setting $y = 1$. So $f_2$, $f_3$, $f_4$,
$\xi$, $\eta$ are polynomials in $\mathbb{R}[x]$ with $\deg(f_i) \le i$ ($i = 2, 3, 4$), $\deg(\xi) \le 2$ and
$\deg(\eta) \le 3$. We assume that $t \ne 0$ is a real number and identity (8.1) holds,
and that $p \in \mathbb{R}[x]$ is an irreducible quadratic polynomial with $p^2 \mid g_t(\xi)$ and
$p \mid h_t(\xi)$. Denoting congruences modulo $(p)$ in $\mathbb{R}[x]$ by $\equiv$, we therefore have

\[ t\xi^3 - f_2\xi^2 - 4tf_4\xi + f_6 = (f_2 - t\xi)(4f_4 - \xi^2) - f_3^2 \equiv 0 \tag{8.2} \]

and

\[ 3t\xi^2 - 2f_2\xi - 4tf_4 \equiv 0. \tag{8.3} \]

Combining (8.2) and (8.3) we get

\[ f_2\xi^2 + 8tf_4\xi - 3f_6 \equiv 0, \tag{8.4} \]

and eliminating $t$ from (8.3) and (8.4) we find

\[ f_2\xi^4 - (8f_2f_4 - 3f_3^2)\xi^2 + 4f_4f_6 \equiv 0. \tag{8.5} \]

8.3. We use $'$ to denote the derivative on polynomials in $\mathbb{R}[x]$. From
$p^2 \mid g_t(\xi)$ we see that $p$ divides $g_t(\xi)' = h_t(\xi)\,\xi' - (f_2'\xi^2 + 4tf_4'\xi - f_6')$, and
hence

\[ f_2'\xi^2 + 4tf_4'\xi - f_6' \equiv 0. \tag{8.6} \]

From $h_t(\xi) \equiv 0$ and (8.6) we can again eliminate $t$ and get

\[ 3f_2'\xi^4 + (8f_2f_4' - 4f_2'f_4 - 3f_6')\xi^2 + 4f_4f_6' \equiv 0. \tag{8.7} \]
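Each of the congruences (8.4), (8.5) and (8.7) is obtained as an explicit combination of earlier ones, and these combinations are polynomial identities that can be spot-checked numerically. The following is a minimal sketch, assuming every symbol ($t$, $\xi$, $f_2$, $f_3$, $f_4$ and the derivatives $f_2'$, $f_4'$, $f_6'$) is replaced by an independent rational number; the multipliers used in `identities_hold` are one possible choice worked out for this sketch, not quoted from the text, and the sample values are arbitrary.

```python
from fractions import Fraction

def residuals(t, xi, f2, f3, f4, d2, d4, d6):
    """Left-hand sides of (8.2)-(8.7), with every symbol treated as a scalar.

    d2, d4, d6 stand in for the derivatives f2', f4', f6'.
    """
    f6 = 4 * f2 * f4 - f3**2
    e82 = t * xi**3 - f2 * xi**2 - 4 * t * f4 * xi + f6
    e83 = 3 * t * xi**2 - 2 * f2 * xi - 4 * t * f4
    e84 = f2 * xi**2 + 8 * t * f4 * xi - 3 * f6
    e85 = f2 * xi**4 - (8 * f2 * f4 - 3 * f3**2) * xi**2 + 4 * f4 * f6
    e86 = d2 * xi**2 + 4 * t * d4 * xi - d6
    e87 = 3 * d2 * xi**4 + (8 * f2 * d4 - 4 * d2 * f4 - 3 * d6) * xi**2 + 4 * f4 * d6
    return e82, e83, e84, e85, e86, e87

def identities_hold(*values):
    """Check that (8.4), (8.5), (8.7) are combinations of earlier left-hand sides."""
    e82, e83, e84, e85, e86, e87 = residuals(*values)
    t, xi, f2, f3, f4, d2, d4, d6 = values
    return (
        e82 == (f2 - t * xi) * (4 * f4 - xi**2) - f3**2          # two forms of (8.2)
        and xi * e83 - 3 * e82 == e84                             # (8.4)
        and (3 * xi**2 - 4 * f4) * e84 - 8 * f4 * xi * e83 == 3 * e85   # (8.5)
        and (3 * xi**2 - 4 * f4) * e86 - 4 * d4 * xi * e83 == e87       # (8.7)
    )

# Arbitrary rational sample points (numerator, denominator), for illustration only.
samples = [
    tuple(Fraction(n, d) for n, d in row)
    for row in [
        [(1, 2), (3, 1), (-2, 3), (5, 1), (7, 4), (1, 1), (-3, 2), (2, 5)],
        [(-4, 1), (1, 3), (2, 1), (-1, 2), (3, 5), (6, 1), (1, 7), (-9, 2)],
    ]
]
all_ok = all(identities_hold(*s) for s in samples)
print(all_ok)
```

Since each right-hand side is a combination of left-hand sides that vanish modulo $(p)$, the new expression vanishes modulo $(p)$ as well; the numerical check only confirms the algebra of the combinations.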

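8.4. For $i, j \in \{2, 3, 4\}$ we put

\[ g_{ij} := i f_i f_j' - j f_j f_i' = f_i f_j \cdot \frac{d}{dx} \log\bigl(f_j^{\,i} / f_i^{\,j}\bigr). \]

Note that $\deg(g_{ij}) \le i + j - 2$, with equality for generic choice of the $f_i$. We
observe the relation

\[ 2f_2g_{34} - 3f_3g_{24} + 4f_4g_{23} = 0. \]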
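The relation between $g_{23}$, $g_{24}$ and $g_{34}$ and the degree bound can be checked on sample polynomials. The following is a minimal sketch with integer coefficient lists (lowest degree first), taking $g_{ij} = i f_i f_j' - j f_j f_i'$; the helper names and the three sample polynomials are hypothetical and chosen only for illustration.

```python
def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])

def scale(c, a):
    return trim([c * x for x in a])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def deriv(a):
    return trim([k * a[k] for k in range(1, len(a))])

def g(i, fi, j, fj):
    """g_ij = i*f_i*f_j' - j*f_j*f_i'."""
    return add(scale(i, mul(fi, deriv(fj))), scale(-j, mul(fj, deriv(fi))))

# Hypothetical sample polynomials of degrees 2, 3, 4.
f2 = [1, 3, 2]          # 2x^2 + 3x + 1
f3 = [-1, 0, 4, 1]      # x^3 + 4x^2 - 1
f4 = [5, 0, -2, 1, 3]   # 3x^4 + x^3 - 2x^2 + 5

g23, g24, g34 = g(2, f2, 3, f3), g(2, f2, 4, f4), g(3, f3, 4, f4)
relation = add(add(scale(2, mul(f2, g34)), scale(-3, mul(f3, g24))),
               scale(4, mul(f4, g23)))
degrees = [len(p) - 1 for p in (g23, g24, g34)]
print(relation, degrees)
```

An empty list for `relation` means the combination $2f_2g_{34} - 3f_3g_{24} + 4f_4g_{23}$ is the zero polynomial; the degrees stay within $i + j - 2$ because the top coefficients of the two terms of $g_{ij}$ cancel.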

8.5. We now eliminate $\xi$. From (8.5) and (8.7) we can eliminate $\xi^4$ and get

\[ (2f_2g_{24} - 3f_3g_{23})\xi^2 - 4f_4(2f_2g_{24} - f_3g_{23}) \equiv 0. \tag{8.8} \]

We can eliminate $t$ from (8.4) and (8.6), getting

\[ g_{24}\xi^2 + 2(f_3g_{34} - 2f_4g_{24}) \equiv 0. \tag{8.9} \]

Finally we can eliminate $\xi$ from (8.8) and (8.9), getting

\[ f_3^2 \cdot (g_{23}g_{34} - g_{24}^2) \equiv 0. \tag{8.10} \]
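The last elimination step relies on the relation from 8.4. The following worked computation is a sketch of one way to carry it out, writing $L_{8.8}$ and $L_{8.9}$ (notation introduced here, not in the text) for the left-hand sides of (8.8) and (8.9); the $\xi^2$ terms cancel in the chosen combination.

```latex
\begin{align*}
0 &\equiv g_{24}\, L_{8.8} - (2f_2g_{24} - 3f_3g_{23})\, L_{8.9} \\
  &= -4f_4g_{24}\,(2f_2g_{24} - f_3g_{23})
     - 2\,(2f_2g_{24} - 3f_3g_{23})(f_3g_{34} - 2f_4g_{24}) \\
  &= 6f_3^2\,g_{23}g_{34} - 4f_2f_3\,g_{24}g_{34} - 8f_3f_4\,g_{23}g_{24} \\
  &= 6f_3^2\,g_{23}g_{34} - 2f_3g_{24}\,(3f_3g_{24} - 4f_4g_{23}) - 8f_3f_4\,g_{23}g_{24}
     && \text{(using } 2f_2g_{34} = 3f_3g_{24} - 4f_4g_{23}\text{)} \\
  &= 6f_3^2\,\bigl(g_{23}g_{34} - g_{24}^2\bigr).
\end{align*}
```

Dividing by 6 gives (8.10).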

8.6. We introduce the following "exceptional" conditions $(S_1)$–$(S_3)$. Clearly,
none of them holds for generically chosen $f_2$, $f_3$, $f_4$:

($S_1$) $\gcd(f_3, f_4) \ne 1$,
($S_2$) $\gcd(g_{23}, g_{24}) \ne 1$,
($S_3$) $\gcd(g_{34}, g_{24}) \ne 1$.

8.7. We show that $f_3 \equiv 0$ leads to an exceptional case. Assume that $(S_1)$
is excluded and $f_3 \equiv 0$. From (8.2) we get $(f_2 - t\xi)(4f_4 - \xi^2) \equiv 0$, hence

\[ f_2 - t\xi \equiv 0 \quad\text{or}\quad 4f_4 - \xi^2 \equiv 0. \]

$f_2 \equiv t\xi$ together with (8.3) gives $4f_4 \equiv \xi^2$ since $t \ne 0$. Conversely, $4f_4 \equiv \xi^2$
and (8.4) imply $f_4(f_2 - t\xi) \equiv 0$, and $f_4 \not\equiv 0$ since $\gcd(f_3, f_4) = 1$. So we
see that $f_3 \equiv f_2 - t\xi \equiv 4f_4 - \xi^2 \equiv 0$ hold in any case, and therefore
also $f_3 \equiv f_2^2 - 4t^2f_4 \equiv 0$. In particular, there exists a scalar $\lambda$ such that
$\deg\gcd(f_3, f_2^2 + \lambda f_4) \ge 2$. By Proposition 7.1 this means we are in the
following exceptional case:

($S_4$) $\Psi_{3,4}(f_3, f_2^2, f_4) = 0$.

8.8. Excluding $(S_1)$ and $(S_4)$ we have $f_3 \not\equiv 0$, and therefore get

\[ g_{23}g_{34} - g_{24}^2 \equiv 0 \tag{8.11} \]

from (8.10). The assumption $g_{24} \equiv 0$ leads to one of $(S_2)$ or $(S_3)$. Excluding
those we have in addition $g_{24} \not\equiv 0$.

8.9. We finally assume that $(S_1)$–$(S_4)$ are excluded, so (8.11) holds with
$g_{24} \not\equiv 0$. We show that this again leads to an exceptional case. Multiply
(8.9) with $g_{23}$, rewrite using (8.11) and cancel the factor $g_{24}$ to get

\[ g_{23}\xi^2 + 2f_3g_{24} - 4f_4g_{23} \equiv 0. \tag{8.12} \]

Multiply (8.4) with $g_{23}$ and use (8.12) to obtain

\[ 8tf_4g_{23}\xi \equiv (8f_2f_4 - 3f_3^2)g_{23} + 2f_2f_3g_{24}. \]

Squaring this congruence and using (8.12) once more we finally get

\[ 128t^2f_4^2g_{23}(2f_4g_{23} - f_3g_{24}) - \bigl((8f_2f_4 - 3f_3^2)g_{23} + 2f_2f_3g_{24}\bigr)^2 \equiv 0. \tag{8.13} \]

8.10. Consider

\[ P := g_{23}g_{34} - g_{24}^2, \]
\[ Q := f_4^2\,g_{23}\,(2f_4g_{23} - f_3g_{24}), \]
\[ R := (8f_2f_4 - 3f_3^2)g_{23} + 2f_2f_3g_{24}. \]

These are integral polynomials in the coefficients of $f_2$, $f_3$, $f_4$. For generically
chosen $f_i$ we have $\deg(P) = 8$, $\deg(Q) = 18$ and $\deg(R) = 9$. We have shown
that the assumption in Proposition 8.1 leads either to one of $(S_1)$–$(S_4)$, or
to the existence of a pair $(\lambda, \mu) \ne (0,0)$ of scalars with
$\deg\gcd(P, \lambda Q + \mu R^2) \ge 2$. By Proposition 7.1 and Corollary 7.3, the latter implies one of
the following two conditions:

($S_5$) $\operatorname{disc}_8(P) = 0$;
($S_6$) $\Phi_{8,18}(P, Q, R^2) = 0$.

8.11. We still need to show that

\[ \Psi_{8,18}(P, Q, R^2) = \operatorname{disc}_8(P) \cdot \Phi_{8,18}(P, Q, R^2) \ne 0 \]

for generically chosen $f_i$. Clearly it suffices to exhibit a single triple $(f_2, f_3, f_4)$
where this number is nonzero. Unfortunately, it seems hard to do this by
hand alone, due to the enormous size of the polynomial $\Phi$. With the help
of a computer algebra program, there is no difficulty: If we take

\[ f_2 = x^2 - x + 1, \quad f_3 = x^2 - 1, \quad f_4 = x^4 + 1, \]

then

\[ P = g_{23}g_{34} - g_{24}^2 = -24x^8 + 60x^7 - 64x^5 + 56x^4 - 20x^3 - 144x^2 + 88x - 16 \]

is separable, and $\Phi_{8,18}(P, Q, R^2)$ is an integer with 372 digits that has the
prime factorization

\[ -2^{713} \cdot 3^{33} \cdot 179 \cdot 233 \cdot 641 \cdot 1531 \cdot 4093 \cdot 11273 \cdot 29983^{7} \cdot 342841^{14} \cdot 66617977107707. \]
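Two parts of this example are easy to recompute without a computer algebra system. The following is a minimal sketch: it rebuilds $P = g_{23}g_{34} - g_{24}^2$ from the three sample polynomials using $g_{ij} = i f_i f_j' - j f_j f_i'$ (with $f_3 = x^2 - 1$ treated as a polynomial of degree at most 3), and it counts the decimal digits of the stated prime factorization. The helper names are illustrative; the value of $\Phi_{8,18}$ itself is not recomputed here.

```python
def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])

def scale(c, a):
    return trim([c * x for x in a])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def deriv(a):
    return trim([k * a[k] for k in range(1, len(a))])

def g(i, fi, j, fj):
    """g_ij = i*f_i*f_j' - j*f_j*f_i'."""
    return add(scale(i, mul(fi, deriv(fj))), scale(-j, mul(fj, deriv(fi))))

f2 = [1, -1, 1]         # x^2 - x + 1
f3 = [-1, 0, 1]         # x^2 - 1
f4 = [1, 0, 0, 0, 1]    # x^4 + 1

g23, g24, g34 = g(2, f2, 3, f3), g(2, f2, 4, f4), g(3, f3, 4, f4)
P = add(mul(g23, g34), scale(-1, mul(g24, g24)))

# Coefficients of P as printed in the text, lowest degree first.
P_from_text = [-16, 88, -144, -20, 56, -64, 0, 60, -24]

# Absolute value of the stated factorization of Phi_{8,18}(P, Q, R^2).
magnitude = (2**713 * 3**33 * 179 * 233 * 641 * 1531 * 4093 * 11273
             * 29983**7 * 342841**14 * 66617977107707)
digit_count = len(str(magnitude))
print(P == P_from_text, digit_count)
```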

Remark 8.12. We can consider $\Phi_{8,18}(P, Q, R^2)$ as an integral polynomial in
the coefficients of $f_2$, $f_3$, $f_4$. To find the degree of this polynomial, note that
$\Phi_{8,18}(P, Q, R^2)$ is homogeneous of degree $7 \cdot 17 = 119$ in the coefficients of
$P$, and homogeneous of degree $\binom{8}{2} = 28$ in the coefficients of $Q$ and in those
of $R^2$ (Remark 7.4.2). Given that $P$ (resp. $Q$, resp. $R^2$) is homogeneous of
degree 4 (resp. 7, resp. 8) in $(f_2, f_3, f_4)$, we conclude that $\Phi_{8,18}(P, Q, R^2)$ is
homogeneous of degree

\[ 119 \cdot 4 + 28 \cdot 7 + 28 \cdot 8 = 896 \]

in (the coefficients of) $f_2$, $f_3$ and $f_4$.

Remark 8.13. The invariant $\Phi_{8,18}(P, Q, R^2)$ is enormous not only by its
degree, but also in terms of the values it produces. If the $f_i$ have small integral
coefficients, then $\Phi(P, Q, R^2)$ typically has several hundreds of digits.

Based on the factorization of this invariant in several sample cases with
integer coefficients, we suspect that the form $\Phi(P, Q, R^2)$ (of degree 896 in
the coefficients of $f_2$, $f_3$ and $f_4$) decomposes as a product of smaller degree
forms.

9. Summary and complements 

9.1. Let

\[ f = z^4 + f_2z^2 + f_3z + f_4 \tag{9.1} \]

where $f_i \in \mathbb{R}[x, y]$ is homogeneous of degree $i$ ($i = 2, 3, 4$) and $f_2$, $f_4$,
$4f_2f_4 - f_3^2$ are psd. In the course of our proof of Hilbert's theorem we have considered
the following exceptional cases:

($E_1$) $\operatorname{disc}_2(f_2) = 0$,
($E_2$) $f_2 \mid f_3$,
($E_3$) $\operatorname{disc}_6(4f_2f_4 - f_3^2) = 0$,
($E_4$) $\operatorname{disc}_3(f_3) = 0$,
($E_5$) $\gcd(f_3, f_4) \ne 1$,
($E_6$) $\gcd(g_{23}, g_{24}) \ne 1$,
($E_7$) $\gcd(g_{24}, g_{34}) \ne 1$,
($E_8$) $\Phi_{3,4}(f_3, f_2^2, f_4) = 0$,
($E_9$) $\operatorname{disc}_8(g_{23}g_{34} - g_{24}^2) = 0$,
($E_{10}$) $\Phi_{8,18}(P, Q, R^2) = 0$.

(Note that the conditions $\gcd \ne 1$ can be rephrased as the vanishing of
suitable resultants.) For counting inequivalent representations, we also needed
to consider the following condition:

($E_{11}$) $\gcd(f_3, 4f_4 - f_2^2) \ne 1$.

Let us summarize the role of these exceptional cases. For every real number
$t$ we considered the equation

\[ C_t\colon \quad \eta^2 + f_3^2 = (f_2 - t\xi)(4f_4 - \xi^2) \]

with the side conditions $f_2 - t\xi \ge 0$ and $4f_4 - \xi^2 \ge 0$.

We had to exclude $(E_1)$ to ensure that $C_0$ has a solution $(\xi_0, \eta_0)$ (for $t = 0$,
Cor. 1.2).

We had to exclude $(E_2)$, $(E_3)$ to extend any solution of $C_0$ to a solution
of $C_t$ for small $|t|$ (Prop. 4.4).

We had to exclude $f_3 = 0$ (which is contained in $(E_4)$), and had to
assume $\gcd(g_t(\xi), h_t(\xi)) = 1$ for all $0 < |t| \le 1$ and all solutions $\xi$ of $C_t$,
to extend a solution of $C_t$ for $0 < |t| \le 1$ into a neighborhood of $t$ (see 6.3).

We had to exclude $(E_4)$ and $(E_5)$ to exclude a linear common divisor of
$g_t(\xi)$ and $h_t(\xi)$ (see 6.5).

We had to exclude $(E_5)$–$(E_{10})$ to exclude an irreducible quadratic common
divisor of $g_t(\xi)$ and $h_t(\xi)$ (see Sect. 8; these conditions were labelled
$(S_1)$–$(S_6)$ there).

9.2. We have proved: If the quartic form

\[ f = z^4 + f_2z^2 + f_3z + f_4 \]

is strictly positive definite with $f - z^4 \ge 0$, and if $f$ is sufficiently generic,
then any solution of $C_0$ (for $t = 0$) can be extended in a unique continuous
way to a solution of $C_t$, for $0 \le t \le 1$. Here "sufficiently generic" means that
$f$ avoids the exceptional cases $(E_1)$–$(E_{10})$. For $i = 1, \dots, 10$, there exists a
nonzero polynomial $\Psi_i$ in (the coefficients of) $f$ such that $\Psi_i(f) \ne 0$ if and
only if $f$ avoids $(E_i)$. Clearly, the set of strictly positive definite forms $f$
with $\prod_{i=1}^{10} \Psi_i(f) \ne 0$ is dense in the space of all psd forms of shape (9.1).
By 3.6 it follows that any psd form (9.1) is a sum of three squares.

Example 9.3. An explicit example of a positive definite form $f$ which is
"sufficiently generic" is

\[ f = z^4 + (x^2 - xy + y^2)z^2 + (x^2 - y^2)yz + (x^4 + y^4). \]

That is, $f$ avoids all exceptional conditions $(E_1)$–$(E_{11})$. (See 8.11 for $(E_9)$
and $(E_{10})$; the other conditions are readily checked except possibly $(E_8)$,
which is avoided since $\Phi_{3,4}(f_3, f_2^2, f_4) = 56$.)
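The value 56 can be reproduced from the root formula of Definition 7.2, with one caveat: after setting $y = 1$, $f_3 = x^2 - 1$ has vanishing $x^3$-coefficient, so the formula cannot be applied to it directly. The following sketch works around this by dehomogenizing at $x = 1$ instead, which reverses the coefficient lists. It relies on an identity that is not stated in the text and is assumed here from a short computation with the root formula: reversing all three polynomials multiplies $\Phi_{m,n}$ by $(-1)^{\binom{m}{2}}$. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def evaluate(coeffs, x):
    """Horner evaluation; coeffs are listed from the leading coefficient down."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value

def phi_from_roots(a0, roots, g, h, n):
    """Phi_{m,n} for f = a0 * prod(x - root), assuming pairwise distinct roots."""
    m = len(roots)
    value = Fraction(a0) ** ((m - 1) * (n - 1))
    for ai, aj in combinations(roots, 2):
        num = evaluate(g, ai) * evaluate(h, aj) - evaluate(g, aj) * evaluate(h, ai)
        value *= num / (ai - aj)
    return value

# Example 9.3 with y = 1: f3 = x^2 - 1, f2^2 = (x^2 - x + 1)^2, f4 = x^4 + 1.
# Reversed coefficient lists (chart x = 1):
#   f3 (as a cubic) -> -x^3 + x = -(x - 0)(x - 1)(x + 1),
#   f2^2 and f4 are palindromic and stay unchanged.
m, n = 3, 4
a0_rev, roots_rev = -1, [0, 1, -1]
g_rev = [1, -2, 3, -2, 1]     # (x^2 - x + 1)^2
h_rev = [1, 0, 0, 0, 1]       # x^4 + 1

phi_reversed = phi_from_roots(a0_rev, roots_rev, g_rev, h_rev, n)
# Assumed sign rule for reversal: Phi(f*, g*, h*) = (-1)^binom(m,2) * Phi(f, g, h).
phi_original = (-1) ** comb(m, 2) * phi_reversed
print(phi_original)
```

Only the non-vanishing of this value matters for $(E_8)$, so the conclusion does not depend on the sign convention.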

9.4. Along our proof of Hilbert's theorem, we needed only little extra effort
to obtain partial information on the number of inequivalent representations
of a psd form $f$ as a sum of three squares. (See Definition 2.5 for the meaning
of equivalence of representations.) Let us review and complete these results:

Theorem 9.5. Let $f$ be a psd form.

(a) When $f$ has a real zero and is otherwise sufficiently generic, then $f$
has precisely 4 inequivalent representations.

(b) When $f$ is strictly positive and sufficiently generic, then $f$ has
precisely 8 inequivalent representations.

Here, "sufficiently generic" means in (a) that $f$ avoids $(E_1)$–$(E_3)$, assuming
$f(0, 0, 1) = 0$. In (b) it means that $f$ avoids $(E_1)$–$(E_{11})$ if $f$ is normalized
into the form $f = z^4 + f_2z^2 + f_3z + f_4$ with $f - z^4 \ge 0$.

Proof. (a) was proved in Prop. 2.9. For the proof of (b) assume that $f$
is normalized as above (Lemma 3.1), and consider the linear pencil $f^{(t)}$ as
in (4.2). When $f$ avoids $(E_1)$–$(E_{10})$, we have proved that we can extend
every solution $(\xi_0, \eta_0)$ of $C_0$ (at time $t = 0$) along this pencil to a solution
$(\xi, \eta)$ of $C_1$ (at time $t = 1$), and that locally this extension is everywhere
unique. Hence, for $t = 1$ there are at least as many solutions $(\xi, \eta)$ as for
$t = 0$, namely 16 (see 2.7). If we also exclude $(E_{11})$, then Corollary 3.9
shows that for $f$ (i.e., for $t = 1$) these 16 pairs correspond to precisely
8 inequivalent representations. In order to show that there are no further
representations of $f$, we need to show that for $t \to 0$ the solutions $(\xi^{(t)}, \eta^{(t)})$
of $C_t$ remain bounded, and thus converge to solutions for $t = 0$. But this is
obvious since we have $4f_4 - (\xi^{(t)})^2 \ge 0$ for all $t$. □

Remark 9.6. These findings are in agreement with the results of [7] and
[10]. As far as we know, this is the first time that results on the number of
inequivalent representations have been obtained by elementary methods.